When software is older than most of the people you're talking to. The title has changed but it was about Zen. That's a different Zen than most of you. Zen comes back anyway. Okay, so that's me. I work at the University of Tsukuba in Japan. And first, since they're paying for this, I have to say something about my sponsors, the Ministry of Education, Culture, Science and Technology of Japan. There are two grant numbers there. You don't need those, but I do. And I will mention that after a decade of trying to get this funded, finally they funded it. I'm not sure what happened. Maybe I just got older or something.
What about me? Why am I writing this? Why am I doing this? I've been participating in Python development since about 2005, maybe a little bit before that. If you look up the date on PEP 263, I didn't actually help write that, but I was pretty noisy at the time because I knew a fair amount about internationalization and Unicode. And at that time, getting Unicode into Python was a big deal. They were already on version 2 at that point. 1.6 was the first one that had Unicode in it, but that didn't work very well. And it couldn't be used in the program. You could handle it in I/O, write it to files and things like that, but you couldn't use it in the program, so that was a problem. I don't know everything about FOSS, or Python for that matter, but I have talked extensively with several of the core developers about the process of building Python and releasing new versions and so on. And that's what I want to talk to you about today.
I'm an economics professor. A couple of people have made you stand up to make sure you're awake. People object to you leaving. This is not my problem. As an economics professor, I have very low standards for those who listen to me. They sleep a lot. They get up and they leave in the middle and so on, so feel free.
And once again, I teach at the University of Tsukuba in Japan, which likes to think of itself as the MIT of Japan, but it isn't really.
The first thing I want to talk about is something that I started doing when I started this project. Was it going to be worth doing the project? Specifically, if you look around, you can see there are big proprietary firms out there. Some of them make so much money that their CEOs can spend money on yachts to the extent that you could buy most of the companies of the people in this room. But more of that money goes back into the company, right? So you'd think that proprietary development ought to be able to drive FOSS out of the market, especially the volunteer projects like Python, most of the GNU projects and so on. Of course, Linux and several other projects have substantial corporate backing these days, but there are a lot of volunteer projects, a lot of student projects, and aren't those just going to become small as the software industry gets bigger and more well-funded?
And the other problem is that the success that we've seen to date with Linux, Python, and these other projects is very much based on the advantages of open source development methodology: seeing the code, the many eyes, the modularity that's encouraged by insisting on reusability. But companies can do that, too. They just don't license it the same way that we do. So you would think that the proprietary business models could probably take this over and neutralize that advantage.
What I did was construct an economic model. The details don't matter, but the results look like this. You've seen these exponentials before, and the one on top, the green one, is proprietary software, and the one on the bottom is FOSS, and it looks pretty bad, doesn't it? Oh, and the little scary face there is the reason why I'm using Safari instead of Mozilla or Firefox.
Well, that's the worst case, and in fact, although it looks like things are getting very, very bad, things are actually getting pretty good. What happens in the model that I constructed is that the ratio of production of proprietary software to FOSS stays constant in the long run. It's proportional to the growth of the labor force, and this goes back to a model by Robert Solow, who won the Nobel Prize for this. So we're not going to get driven out of the world. There's always going to be FOSS for the rest of the foreseeable future, as far as I can tell, and I think that's good news.
So why am I doing this? Well, I'm worried about Japan. I look around the Japanese software industry. In the 1980s and 1990s, there was an alarmist movement among American observers, Edward Feigenbaum and Yourdon, fearing obsolescence because of the famously high quality of Japanese quality assurance. Also, the Japanese were deliberately trying to create a so-called knowledge engineering revolution through their Fifth Generation project. Well, the first is true to some extent in embedded software and things like that, and the second basically failed. It managed to produce fuzzy things like Japanese input software, because Japanese is an insanely complicated language to try to put into the computer, and you also have fuzzy washing machines and things like that, but no knowledge engineering revolution. Knowledge is much too fuzzy, apparently. Today... Oh, I just said that. I should also mention that I've talked to Ruby founder Matsumoto about this stuff going on to the second part of the project, and he sees a need for fundamental reform in these industries.
So I came to Python, which I'm somewhat familiar with, because I think of it as a fairly successful open source, open community project, as opposed to something developed by a company or a small group of people who are very close-knit. The Python community is huge.
PyCon, two years ago, had 2,500 people and they were still taking people in at the door. That's in the U.S. I'm not sure what it's going to be like in a couple of months here, but probably 300, 350, something like that. Still, big communities everywhere, and the worldwide community is absolutely unmanageable. We can't really think of this as a company or a single organization. So how do you deal with something like that? Or if you're small, but you just want to attract lots of people to come in? Educators, for example, would like to attract teachers and students from all over the world, and those people are not going to be intent on your project. They're going to want it for the six weeks of their module on some sort of hardware work or whatever it might be. So how do we manage a volunteer project, a community project like this?
We need process, everybody agrees with that. Software processes go back to the bondage and discipline of the waterfall model, Fred Brooks at IBM and his Mythical Man-Month, the Software Engineering Institute, and various DOD specifications for how you write a plan to write software. In reaction to that idea of process, we have Eric Raymond, among others, with his talk about the many eyes. He didn't invent the term, but he popularized it, I think, along with his bazaar organization of software projects. But today, we think more in terms of some sort of semi-formal process, at least agile methodologies: process, yes, but no straitjackets, please. That's what we're up to, and there are a legion of others that are going on today, but everybody acknowledges that process is needed.
So what does Python's process look like? Well, first of all, Python is open-source free software. So again, we have this process-light idea: less process, more product. That's what we want. That's what we think we're getting. In these communities, we have massively distributed development. Every user is a contributor.
You need to remember that when you're talking to those people who post insane questions on your mailing list and things like that. These people are potential contributors. You can do something with them. They can help you. You can help them, and you need to learn to manage them, in my opinion. That's been my experience.
We have open development, and that means two things. One is that every user is a literary critic of your code. They're going to complain about positions of apostrophes, whether you use English or American spelling, whether you use simplified or traditional Chinese characters, and so on and so forth. This does tend to improve style in your code. People will tell you things, but if you listen, it helps. The other aspect of open development, of course, is that reusability becomes really important. There's a pull factor, which is that people want it reused, and they will say, look, if your software only did this or had this API, I could use it, and that pulls reusability out. There's also the push factor: you want to get your software out there, and you want to organize your things so that you have a recombinable set of tools that you can put together easily. Of course, we don't want cargo cult programming, where we just grab a few lines and so on. We really want designed-in modularity in the APIs and so on. We do want modules, but we do want reusability. This is something that Matz has emphasized in the Japanese context. There's a real application in social engineering that's coming up, but it's not directly relevant to Python.
So, FOSS goes into the enterprise. Early FOSS was FOSS of the hacker, by the hacker, and for the hacker. Cowboy geniuses committed mega-LOC patches and the rest of us fixed the thousands of bugs they introduced. I mean, one bug per thousand lines, that's pretty high quality code by most standards, but if you put in a million lines, that's a lot of bugs. Still, enterprises found FOSS to be an attractive method of development.
They could use the modules because they were free. Of course, you have copyleft versus permissive licensing, but as long as your own business model can adapt to the particular license that's being used for the components you're using, you get it mostly for free. If you want support and things like that, you have to pay extra, of course. But that's much more acceptable in terms of the service model and the transparency of the economics of what you're doing to build your software. So, I have a bunch of examples there. But enterprises also need reliable platforms. They want reliable hardware, they want reliable operating systems, languages, libraries, and frameworks. And this is going to come back later in the demand for backward compatibility, which I'm going to emphasize.
So, here we are at Python and finally at the Python trade. I've seen spectacular growth in popularity, probably not due to Eric Raymond pushing it, but he thinks so. There's an obvious process in Python development. You may not be able to write it down. I haven't been able to write it down, but it's very clear that the Python development community, the core community that develops the language itself plus the standard library, has a very strong notion of process. I haven't really been able to talk to people like the Twisted Matrix people or PyPy or some of the other major add-ons, Django. These are all people I'd like to talk to, to see if this sort of thing spills over into their communities as well.
In Python, there's a very obvious process going on. I interviewed a bunch of community members, development leaders, Python-using project leaders, Python-using enterprises. And what I'm missing, as I say, is these framework developers. I really want to talk to them. That's something I really want to do in the future. And one of the symptoms that's important is that they're the ones that complain most about Python 3. And developers generally don't have a problem with Python 3.
They either migrate or they don't. But framework developers have customers on both sides of the fence. They have migrators and non-migrators and maybe later migrators, and the three different groups have very, very different requirements on the development process that they would like Python to adopt. So I really want to talk to the framework developers. That's an important thing that's missing so far.
Who have I talked to in particular? Guido van Rossum, of course. Nick Coghlan. Martin von Löwis, who is important because of the Windows connection. Barry Warsaw, who I know through GNU Mailman as well. And then organizers. Steve is... I can't remember any names that aren't Japanese anymore after 25 years there. Steve Holden, not B. Holden, is a PSF board member, or was a board member, I think, and he organizes conferences like this one for a living as well as doing teaching and consulting. And Jesse, again, I can't remember his name. He was the guy who organized PyCon. Jesse Noller. Yeah. Jesse organized PyCon for two years or three years in a row, and that was a massive job. Hats off to him. Application developers are basically limited to Barry at the moment. He works on GNU Mailman, and in his day job for the period when I was talking to him, he was working on Launchpad. And I talked to Andrew. I want to say Choi, but that's a different person. Andrew Chen, maybe, who is CEO or CTO at Continuum Analytics. So those are who I talked to.
So what does a Pythonic process look like? How does Python organize things? We can talk about some of the elements. It's hard to talk about how it all fits together because it's all informal. But first there's the BDFL. That's very obvious. You have the benevolent dictator for life, Guido van Rossum, who makes the final decisions. However, delegation is very important in the Python process. Guido does not make all the decisions.
So they have a practice they call a benevolent dictator for one PEP, where a PEP is a particular controversial thing that needs lots of discussion. We have the Zen of Python, a group of sayings about how you should design your program and how you should write your program. We have the PEPs themselves, and we have some module owners as in other projects. Often somebody will own a particular module, but that's not very common in Python. Still, it does happen. There's a lot of conscious automation. I think everybody's aware of automation these days, so probably I don't need to say too much about that. But I would like to throw out some of the things that Python has done. And finally there's the issue of channel discipline. How do people communicate with each other? And this is one of the things that at least the Python people think they're very weak on, despite having a lot of framework there. So that's something that I think would be interesting to hear a little bit about.
The benevolent dictator for life, Guido van Rossum: he's the founder. This kind of thing happens a lot. We have, of course, RMS, who is the BDFL for many different projects: Emacs, and to some extent most of the core GNU projects, GCC, glibc, and so on. He has a lot of influence over those. Larry Wall and Perl, Matsumoto and Ruby, Theo de Raadt and OpenBSD, Linus Torvalds and Linux. This seems to be somewhat related to Brooks's theory that there should be one architect of a software system. It seems that it's very difficult for multiple people to actually architect a system. This is one thing that I think Python's delegation system has managed to weaken. Guido is not the only architect in Python. People get tired, they do something else, whatever. And that's another place where delegation comes in.
One of the things that they do a lot is establish values. RMS is all about values. Guido brings in a very strong emphasis on backward compatibility and now enterprise readiness.
Style. Significant white space. "What are you smoking?" was the reaction of most of the programming language community back in the 90s. But it works. So these kinds of things are something that the founder or the BDFL is responsible for installing and encouraging and sometimes enforcing. Guido himself started Python development in the late 1980s. The first external releases were in the early 1990s, if I remember correctly. I tried to check all this stuff, but unfortunately wireless at SG doesn't like me very much.
He's the ultimate arbiter of Pythonicity. What does that mean? It means that certain kinds of syntax are non-Pythonic. For example, lambda is not Pythonic. Guido doesn't like anonymous functions, period. He was willing to allow that for one-off, tiny expression-like things, for example, extracting a key from a dictionary or something like that so that you can do sorting. But no more than that. There's a legion who would disagree with him, even in the Python community itself. But that's something.
He has his ideas about APIs. He says, if you're likely to be writing a particular constant over and over again as an argument to a function, don't do that. Get that argument out of there and name the function. Make it two functions if there are two values for this argument. So if you have a file to append to, then you should write open_append or something like that, rather than open(mode="append"). In fact, with open, he doesn't do that. He follows C there, but that's one of the things that he talks about.
Coding style: there's a whole PEP about how you write things, how you name things, and so on. Do you use camel case? Do you use underscores? All of that. These things are something that the leader does, and whether you like the style or not, it makes Python programs more readable if everybody follows it.
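To make the API advice and the lambda remark above concrete, here is a minimal sketch. The helper names open_for_overwrite and open_for_append are hypothetical and are not part of the standard library; as the talk notes, the real built-in open takes a mode argument instead. The records list is made-up illustration data.

```python
import os
import tempfile

# Flag style: every call site repeats the same constant argument.
def open_log(path, mode):
    return open(path, mode, encoding="utf-8")

# Named-function style: one hypothetical helper per value of the constant.
def open_for_overwrite(path):
    return open(path, "w", encoding="utf-8")

def open_for_append(path):
    return open(path, "a", encoding="utf-8")

# Small demonstration in a scratch directory.
demo_path = os.path.join(tempfile.mkdtemp(), "notes.txt")
with open_for_overwrite(demo_path) as f:
    f.write("first\n")
with open_for_append(demo_path) as f:
    f.write("second\n")

# The one-off use of lambda described in the talk: pulling a sort key
# out of each dictionary.
records = [
    {"name": "b", "size": 30},
    {"name": "a", "size": 10},
    {"name": "c", "size": 20},
]
by_size = sorted(records, key=lambda record: record["size"])
```

The call sites read as open_for_append(path) rather than repeating a mode constant each time, which is the readability gain the advice is after.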
You know what you're looking at and you don't get that cognitive dissonance when somebody has a very different style. I don't want to see what Picasso would do to a Python program.
The other thing about Guido that I think is very important, that we should learn from, is that he's a skilled delegator. He attracts associates who know more than he does about lots of things. There was a book in the 70s called Up the Organization, and I'm not sure whether "up" meant the finger or your own rise in the organization, or maybe both. The author wrote the book in Unix man page style before Unix man pages actually existed. Each page basically is one topic. So if you ever get a chance to read the book, it's good for the reading room. You put it there, you read it for five minutes, you put it down. Anyway, he said that good leaders can be recognized by the fact that the people around them are good. It's not that the good people make the leader look good, it's that the leader is there and he manages to, or she manages to, assemble the people who are doing the good job. And then he lets them do their work. This is what delegation is all about. Guido is really good at this. I wish I could tell you how to do it from watching Guido, but I don't know. You know, wink wink, nudge nudge, say no more, say no more.
One way you can delegate is you can let people own modules. This is sort of the traditional Linux lieutenant kind of thing. You have people who own subsystems, they make the decisions, and Linus only very rarely overrules them. When he does, it's pretty spectacular and fun to watch if you're interested in fireworks. In Python, the area of responsibility tends to be more self-contained than the subsystems that we often see Linux lieutenants handling. Things like ElementTree, which is a specific module for XML processing, and it's fairly low level. It's not a full XML with XPath and all that kind of stuff. It's just the little pieces.
TimSort is the workhorse sort function in Python, and that one's basically owned by Tim Peters, who also has a lot to do with the floating point stuff, the IEEE 754 standard. Most of these owned modules are specific libraries like ElementTree. TimSort is a built-in function, and that's very exceptional; almost nobody owns pieces of the core language and built-in functions. In Python, it's important that when you contribute a whole module, it doesn't automatically make you the owner. You have to negotiate that. Typically what Python wants, what Guido has established as the normal practice, is that you contribute it, you promise to maintain it for a while, but anybody else who finds a bug can go in and fix it without your permission. Extensions can be added, if the community agrees, without your specific permission. Mostly Guido treats Python as a whole in the same way.
The second aspect of delegation is the benevolent dictator for one PEP. Guido often says, I don't have time to deal with this. Nick, you do it. Antoine, you do it. Or sometimes he says, I don't know who should do this. Is there somebody out there who knows about this problem? Usually it's not the person who's proposing the new feature or whatever it is, but somebody will come up. In one case, Brett Cannon raised his hand and said, oh, this happens to be my PhD thesis. Guido said, okay, you're the BDFL, you're the person who makes the decision on this PEP. And often the BDFL delegate, or BDFOP, is heavily involved in the development as well. Sometimes this causes a little bit of personality conflict, but so far most people have trusted this, partly because they trust the people that Guido chooses, and partly because they trust Guido to step in if things get too hairy.
Why is this useful? It broadens delegation to developers who aren't comfortable with the responsibility of owning something, or maybe they don't want to own anything. And also to aspects of the project which can't really be owned.
We do PEPs on things like changing from Subversion to Mercurial. And it seems likely that we'll actually also move to Git in the more or less near future. That's not clear yet. There are a lot of people who think we should dogfood. There are a lot of people who just don't like Git, and so on. But from the point of view of recruiting new developers and having people know the tools when they get there, Git has a lot to say for it. GitHub, GitLab, and so on. GitLab I guess is going away. No, it was another one. Gitosis I guess is going away. GitLab is still going. So those things are attractive automation aspects, and Guido is not a fanatic about everything being open. GitLab is open, so that's probably where the community will push if we go to Git. Anyway, it broadens the aspect of leadership in the project. We have lots of people who are leaders in the project and they are not jerks about it. There's a protocol for how you do these things.
Okay, moving on. The Zen of Python. It's a group of sayings by Tim Peters. You can find out what they are exactly by typing python -m this, which is a Zen reference. And there are things like "there's only one obvious way to do it." Or actually, there's one obvious, and normally only one obvious, way to do things. Which is in contrast to Perl: there's always more than one way to do it. So which one do you like? That's up to you. I happen to like the Zen. I happen to like these principles. Not every three-line function needs to be built in, and a bunch of others: better now than never, but sometimes better never than right now, and things like that. I like these principles. But that doesn't matter really. That's not what makes this work. It matters to me personally, but why does Python work so well? Python works so well in part because when you pick up a Python program you know the style already. You don't have to move from Rubens to Picasso or from Bach to the Beatles.
Well, actually that's not such a long jump. From Bach to Philip Glass. It makes it composable. People tend to express themselves in the same way. There's a certain pattern to the APIs and so on. And this makes it much easier to put parts together when you're building a program out of existing pieces, out of existing software.
The PEPs: formal proposals for changes to the language or module additions to the standard library. Most changes to the language are PEPs. Most evolutionary changes to the stdlib are not. Things like, for example, adding a few new functions to an old module that handle some corner cases or something like that usually are not particularly controversial, and they'll sort of get waved through without a formal PEP after some discussion. On the other hand, adding a new one is different. For example, one of the big ones recently was that Python finally added an Enum class, so you can now have named constants, and they will appear in printout from the print function and so on as named constants rather than the integers that are sitting behind them. Things like that.
Anybody can write one. The acceptance criteria are basically IETF-like: rough consensus and running code. You have to have an implementation or it won't be accepted. Lots of PEPs get turned down. I forgot to write that, but lots of them get turned down. In fact, some PEPs are written specifically to be turned down. Guido says, this is the third time I've seen this stupid idea. Let's explain why it's stupid, write it down, write a PEP, and finally reject it. So it's a compact record of controversies, the arguments pro and con, and the outcome. This is very useful both for people who are trying to understand what Pythonicity means and also for people who have a specific interest in improving Python and are wondering whether their particular proposal is going to go through or not.
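The Enum behavior mentioned above, named constants that print by name rather than as the integer behind them, can be sketched briefly. The Color class and its members are made up for illustration; only the enum module itself comes from the standard library.

```python
from enum import Enum

# A hypothetical set of named constants backed by integers.
class Color(Enum):
    RED = 1
    GREEN = 2

print(Color.RED)         # Color.RED
print(repr(Color.RED))   # <Color.RED: 1>
print(Color.RED.value)   # 1

# Looking a member up by its underlying value returns the named constant.
same_member = Color(2) is Color.GREEN
```

The integer is still available through .value, but ordinary printing and debugging output show the name.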
If there's something like it that has been refused, if there's something like it that's been sitting and not being improved for a while, that gives you a lot of clues to where things could go in the future. And as I said, there's no presumption of acceptance, and in fact a lot get refused. Some are even intentionally written to be refused. This helps to act as a brake on some of the more radical ideas, and that's not necessarily a good thing. Radical is good sometimes.
Python 3 is radical, obviously, at least in the sense of changing things at the root. And that was a big experiment. At the time Guido made a few remarks that I interpret as him thinking of this as an experiment. Backward compatibility is really important, but we're sick of the restrictions that we're facing, and so we're just going to make a complete break. We're going to keep the good things in Python, but we're not going to worry about breaking people who depend on the bad things. And that was what Python 3 was about. That's pretty much what happened. There were some big mistakes made. For example, when he split bytes and Unicode apart and made str, the fundamental string type, be Unicode, he took away things like regular expression processing and so on from bytes. That turns out to be a big mistake, because these are basically character streams and they're treated that way, especially in web development. So we should have had that. We should have had things like startswith and endswith and so on, and eventually those things have been brought back in, and most recently there was a PEP which was accepted to add percent formatting back in for bytes. So not everything goes perfectly well, but there's this idea that there should be a brake on radical decisions, and that you make big breaks all at once, and that's what happened with Python 3.
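As an illustration of the bytes operations discussed above, here is a minimal sketch of how they look in current Python 3. It assumes Python 3.5 or later, since percent formatting for bytes arrived with PEP 461; the header and status-line values are made up for the example.

```python
import re

# A made-up protocol header, held as bytes rather than str.
header = b"Content-Type: text/html"

# Prefix and suffix tests work directly on bytes.
is_content_header = header.startswith(b"Content-")
is_html = header.endswith(b"text/html")

# Regular expressions work on bytes when the pattern is bytes too.
match = re.match(rb"([\w-]+): (.+)", header)
field_name = match.group(1)

# Percent formatting on bytes (PEP 461, Python 3.5+).
status_line = b"HTTP/1.1 %d %s\r\n" % (200, b"OK")
```

Note that the pattern and the arguments to %s must themselves be bytes; mixing in str raises a TypeError, which is the bytes and Unicode split the talk describes.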
There's a lot of conscious automation in the project. In particular, when they migrated VCSs: they went from CVS to Subversion, that had a PEP; when they went from Subversion to Mercurial, that had a PEP; and now they might go to Git again. In fact the Mercurial migration had two. There were three candidates, Bazaar, Git, and Mercurial, so we had a face-off in one PEP on the three different VCSs, and then when Mercurial was chosen they had to have a plan, and they decided to write that up as a PEP. Very formal process in that sense. We have lots of mailing lists and issue trackers. Most of the bugs and simple patches are handled there; people who are interested discuss them on the issue tracker and they never appear on the mailing lists. There's an automated review tool called Rietveld, which I believe is an App Engine app that we either wrote or upgraded. Automated testing using Buildbot, and there are a lot of proposals for further automation of continuous integration and things like that.
Channel discipline. How do people talk to each other? This is really important. How are we doing for time? I'm over time, actually. Courtesy is demanded. Censorship is very rare. There are six levels of lists there. There's the Python list itself, which is also gatewayed to comp.lang.python, for users. That includes downstream library and framework developers, Twisted Matrix, SciPy, NumPy; those people are mostly supposed to be talking about their problems in writing Python on the general users list. There's core-mentorship, which is what it sounds like. New core developers are given advice there. Here's a good bug you could start with, things like that. python-ideas is for developers with undeveloped ideas, bare proposals and things like that, that don't have a lot of backing, but hey, this is a blue-sky idea. What do you think? And python-dev is for fine-tuning the concrete proposals, discussion of PEPs and things like that. Other stuff leaks in but often gets pushed back.
Then there's the issue tracker itself and Rietveld, which have their very specialized purposes, and finally there are a bunch of SIG mailing lists, for distutils and packaging, for email, and then there's a Python tutor list, which is an adjunct to the Python list that has the same kind of function as core-mentorship except that it's for general developers in Python. The appropriate level is enforced gently but firmly.
Why do application developers and so on like Python? Language preference is there, but it's a welcoming community. Of course it's a big community, lots of busy people, and you don't always get a hug when you come in, but people don't slap you in the face either, and if you're willing to be quiet and explain what your needs are, somebody will eventually talk to you. And people who are experienced programmers in Python who have a need for a new feature in Python usually get a good welcome.
Then backward compatibility is a very big feature for most people who are developing in Python. It's a platform for them, and they can develop their own features that they need. They don't really need to see the latest in language design features in there. They do want to be able to upgrade and not have their entire system subject to breakage because they want to use generators or something like that in one module, and all the others then have to be tweaked for all the little tiny things. So backward compatibility is really important.
Batteries included, and you can get extra power packs from the Python Package Index. This is very easy: just pip and the name of the module, and it's installed for the right version of Python if that's your default; otherwise you add a version number and you get the right one. The community puts a lot of effort into maintaining this. Not everybody, but there are people who really care about the package index and they put a lot of time and effort into that.
Finally, what's the bottom line?
Delegation frees up leader time and strengthens the associates of the community. Collaboration in a diverse, dispersed community is just plain hard. There's substantial dissatisfaction in Python with the communication channels we have, but there's no consensus on how to improve the situation. So if you're a small community, if you're a new project, your mileage is going to vary on the specifics. I don't know what you need and I can't really say Python is a good model for you. The current focus in Python is improving existing tools. Moving to Mailman 3 when we finally release it. Better review tools; the existing one was good but not good enough. More continuous integration, things like that. For platform tools like a language or a major framework or something like that, backward compatibility from early on is really important to acceptance in the enterprise. Guido and others still recall the Boolean fiasco with pain. They introduced the Boolean type in a point release and it broke the world for some people, because their tests depended on 0 and 1 coming out, and it was coming out as True and False, so they were getting exceptions and the whole thing broke. Finally, process automation, especially tests, helps a lot. Done. Since I have no time, you'll have to ask me out of band about Python 3, and that's the end. No more slides. Well, I have more slides but no time. Thank you very much. Any questions? Yeah.
The question is about sustainability of open source projects. You have this model that you said was the worst case scenario.
Yes.
From that point of view, looking at Python as a case study is not the best case scenario, in the sense that it's a programming language and it's a problem. Can you take that?
No. What I'm saying is that even though you see the absolute amount of proprietary software growing to be much bigger than that of the FOSS software, the proportion stays the same. So FOSS is not going to go exponentially or asymptotically to zero as a fraction of the software output.
There's always going to be a significant fraction of FOSS out there. As for Python, Python isn't going to grow that way. Python's going to have an S-curve just like every other single project or whatever.
But looking at Python, an open source programming language, can you actually infer that much about open source projects in general?
It's a very abstract model. It's a very simple macro kind of model. I have quantity of software, and software is used to produce software, and labor, and that's all. So it doesn't really have anything to do with that.
I didn't mean about the model. I meant the case that you focused on.
As far as the stuff I talked about Python, some of it's generalizable and some of it's not. But I wanted to throw out in an organized fashion what I've learned about how Python works, what I think makes it succeed, and suggest that people look at that and study that for their own projects. And there are a few clear lessons. The delegation lesson I think is very generic, because you can see it in various places. You can see it in Linux. You can see it not working in Emacs. RMS just won't let go. And he shows up every once in a while and says you can't do that. And he has his reasons, which are valid on his grounds, but it doesn't help the project to grow. And so delegation is really important. That's about the only thing that I would say everybody should learn: you need to let go of your project in order to let smart people help you build it.
I think we should have an exit go. Definitely. We are stuck in with workshops, so I guess people who are interested can also come and corner you.
Yes. I'll be here tomorrow. Exit also.