So in this talk, I'm going to try to cover a little bit of where I want to see the technical policy documents, all of them, heading. There will be another talk later on, I think it's Friday, where I spend a couple of hours detailing how various bits and pieces of policy fit together, and what the relationship with release management and release-critical bugs needs to be, which is different from where it is right now. Given that, why do we need technical policy in the first place? Technical policy is something that we use to make sure that different parts of the operating system fit together well. So it is required for integration. This, and this is my personal belief, is the one thing that sets Debian apart from most of our sister distributions like Fedora and Gentoo: we have a strong technical policy that we make everybody subscribe to. So anything that helps in this process, in the creation and maintenance of policy and adherence to it, improves the Debian operating system. So what is the motivation behind this? Debian policy is, I need to really turn this off, it's very dry reading. It is getting to be large and bloated and, unfortunately, in trying to be precise about exactly what it is specifying, it is couched in the kind of legalese that tries not to leave any loopholes open, in which it's only a fair success. But at the same time, that robs any rational human being of any desire to sit down with policy and actually try to read it. It's not something I want to do on a rainy Sunday afternoon, you know, just curl up with Debian technical policy and spend a couple of hours reading. This is also one of the major obstacles for new maintainers who are trying to get into packaging for Debian, because instead of actually working with their software and installing it, they have to sit down and make sure that they adhere to what the rules and regulations are.
Apart from being written in legalese, the Debian technical policy is only one of the things that people have to read. And we will cover some of this in more detail on Friday. But the policies that packagers need to read start from the technical policy and the release-critical bug list that we have. At the other end, there are things that are just simply a good idea to do, like the Developer's Reference. In the middle, there are various other policy documents which are well-established and mature, like the Perl policy document or the X Strike Force stuff. And then there is the limbo in which our Python policy currently sits. Unfortunately, if you're writing a Python package rules file, you can't just ignore the Python policy because it is not as mature or as consistent as some of the other policy documents are. Automated tools like Lintian help a lot. The problem with our current setup is that, like everything else related to policy, Lintian too has grown organically. And for some of the checks, you're not quite sure of the provenance. If we could be sure that when Lintian says that this is a serious bug, that means the package should be rejected upfront, it would help a lot. We can't quite do that because some of the things that Lintian talks about don't match what policy is now or used to be. So, we need to have a better idea of how to check our packages against the various policies. An extension of this is if you have sub-projects, like Debian for Kids, they might have other policy documents that are not really relevant or valid for packages in general but are relevant for the packages in that sub-distribution. I can also see derivatives like Ubuntu and Kubuntu and, what, we are up to about 86 distributions derived from Debian. Any of those guys might have their own local policy.
Either they fork Lintian and add a whole bunch of checks, or we provide an easier mechanism for overlaying policy documents and policy checks based on the person running the check and their preferences. The other part that is a problem in policy is that, hey, it's written in English. Anybody who follows Brin's Uplift saga knows that English is a horrible language. There's the game of telephone that I used to play when I was a kid. You have a bunch of kids sitting in a circle, you whisper something in the ear of the person next to you and they transmit it along. Eventually it comes back to you and what you hear has absolutely nothing to do with what you started off with. There is very little redundancy in English, so any distortion is amplified. Well, okay, we are not going to play telephone with policy, but still. We have developers coming from various cultures whose mother tongues are not English. English is, I think, my fourth or fifth language. I don't understand it as well as some of you native speakers probably do. We have to cater to all these people coming in and perhaps not catching the genesis of some phrase that is used in policy. Thankfully, due to the nature of Debian mailing lists, any mistakes that the policy editor makes are swiftly and painfully brought to their attention. But we could still miss things that the people who read the policy mailing list might not catch. We need to add to policy something that is not as imprecise as a human language which just kind of grew organically out of nothing. I should also mention something, and I'm as guilty of this as most of the people on the list: because policy is trying to be precise and is couched in legalese, it brings forth the tendency to nitpick. Any document that is trying to be precise and tries to leave no loophole brings forth in people the desire to poke at the holes which they still see present in the policy document.
Also, a few years down the line, when the people who originally put in that bit of policy have long moved on, people wonder what exactly was meant by that phrase in policy. The original motivation and the original goal have often been long forgotten. Sure, we can add in little footnotes that talk about what the motivation and the goal and the rationale for the policy piece are. But that again suffers from interpretation. Even if you share the same language and English is your native tongue, there can be multiple interpretations of the same stanza in policy. I am not speaking hypothetically. If you just go back over the archives, you'll find various people debating, somewhat like our ex-president Clinton, what the meaning of the word "is" is, in order to explain that stanza in policy. So what am I proposing right now? My tentative proposal is that we add, along with any policy proposal, something in pseudocode that would either define what we are trying to do or add a check on the package. Of course, not every policy proposal is about packaging, and those can't have a check. It might be too hard to do offhand, and we don't really want to add Perl to the technical policy. I would like to, but then I will get jumped on by people who happen to like Python or something equally horrible. So I am suggesting that we take some kind of pseudocode, which is language neutral and just defines what the check for that policy segment should be. Here is a bit that I grabbed out of the md5sums check from Lintian. So you can see that if the control file is missing, there is a certain tag that is attached and we return that. If the control file exists but is empty, a tag is returned. We parse the control file and tag if there is an error. Well, we first read it and see if we can read it. Then we parse it, and the same thing with the md5sums file. This is, I think, about 80% of all the checks that we actually perform on md5sums. I ran out of space at the bottom of my slide.
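The slide itself is not reproduced in this transcript. As a rough illustration only, the steps just described might look like the following Python sketch. The function name `check_md5sums`, the tag names, and the in-memory representation of a package (a dictionary from path to file contents) are all assumptions made for this example and are not taken from Lintian's code; the comparisons that follow the parsing step are a guess at the kind of checks the rest of the slide would hold.

```python
import hashlib
import re
from typing import Dict, List, Optional

# One md5sums line: 32 hex digits, whitespace, then a relative path.
MD5_LINE = re.compile(r"^([0-9a-f]{32})\s+\*?(\S.*)$")


def check_md5sums(md5sums_text: Optional[str], files: Dict[str, bytes]) -> List[str]:
    """Return the tags raised for one package (tag names are illustrative)."""
    if md5sums_text is None:          # control file missing: tag and return
        return ["no-md5sums-control-file"]
    if not md5sums_text.strip():      # control file exists but is empty
        return ["md5sums-control-file-is-empty"]

    listed: Dict[str, str] = {}
    for line in md5sums_text.splitlines():
        match = MD5_LINE.match(line)
        if match is None:             # parse error: tag and return
            return ["malformed-md5sums-control-file"]
        listed[match.group(2)] = match.group(1)

    # Assumed follow-up checks: compare the listing against the shipped files.
    tags: List[str] = []
    for path, digest in sorted(listed.items()):
        if path not in files:
            tags.append(f"md5sums-lists-nonexistent-file {path}")
        elif hashlib.md5(files[path]).hexdigest() != digest:
            tags.append(f"md5sum-mismatch {path}")
    for path in sorted(files):
        if path not in listed:
            tags.append(f"file-missing-in-md5sums {path}")
    return tags
```

Each branch either returns a single tag and stops, as in the talk's description of the missing, empty, and unparsable cases, or accumulates tags for individual files. An empty list means the package passed this check.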
There was one more line that I needed to add. I think that even if you don't really understand English all that well, it should be fairly clear what the check is trying to do. There is very little ambiguity here. The policy process is already seen to be too hard and to take too much time. And I think there is some merit in those complaints. There has been a certain lack of motivation, I guess, because very few people who are actually on the policy team have actually been committing any changes. That has luckily started to change. Russ is here. He has been very enthusiastic. He already has two policy changes ready and waiting to go in, and we should have a new version of policy out for you guys within, I don't know, a couple of weeks after getting back from Edinburgh. I don't think this actually adds that much of a burden on people who are proposing policy changes. Technical policy is not meant to be changed on a whim. Policy changes have the potential of affecting every single developer and every single package in the archive. Making people stop and think and come up with pseudocode shouldn't be that much of a burden. Surely, even if somebody is not a programmer, it shouldn't be that hard to find a bunch of developers to second your proposal who could create pseudocode for you. So I don't think it's going to slow the process down very much more than what we already have. Not that that's saying much, but still. We already covered this at the start. Why do we need this? Because the diagnostic that is issued is based on the pseudocode that was proposed and accepted as part of policy creation. You can't go and argue with the Lintian author that, hey, you got it wrong, that's just not what policy says. Because policy would indeed say that, in pseudocode. This is the other fun thing. We are now beginning to get out of my base proposal and into the kind of fun things we can do. Policy files are currently written in SGML.
I am going to propose that we move to something like DocBook XML. This is a far better understood format. There are all kinds of conversions out of the DocBook format into any other printable format. It has far better HTML conversions. And there are gazillions of people working on it, as opposed to the one or two people who work on DebianDoc currently. With that, we would also gain the ability to use an XSLT transform to just grab the pseudocode out of the technical policy. So any time somebody is trying to implement a new policy checker, or to implement checks for a new policy document, you just run the XSLT transform and, bingo, you have pseudocode with comments in there leading back to the section of policy the pseudocode came from. I think it would make the life of the Lintian and Linda authors simpler if they had this tool to automatically extract the pseudocode and then diff different versions of policy. We can go even further, and this is an idea that I've been playing with for the last couple of months. It's one thing to just automatically extract pseudocode from a bunch of files. But let's not think of the pseudocode as a check, as in a programming language check. Think of it as a directive, written in a more precise language than English, about what the policy stanza actually means. What I'm proposing is that in the long term we come up with an ontology of policy, a policy definition ontology that defines terms. We already have a database of Lintian and Linda checks which we can use to create the taxonomy of the verbs and objects that most policy checks deal with. Adding the rules to the ontology shouldn't be that hard a process. Especially with all these newfangled young whippersnappers coming out of grad schools with PhDs and an understanding of ontologies and rules and inference and all these things that an old-timer like me has to scratch up on my own.
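As a rough sketch of the extraction step described above, assume the pseudocode lives in `<programlisting role="policy-check">` elements nested inside DocBook `<section id="...">` elements. That markup convention and the function names below are hypothetical; nothing in policy uses them today. The talk suggests a stylesheet transform for this job; the same idea is shown here in Python using only the standard library:

```python
import xml.etree.ElementTree as ET
from typing import Iterator, Optional, Tuple


def extract_checks(docbook: str) -> Iterator[Tuple[str, str]]:
    """Yield (section id, pseudocode) pairs in document order."""

    def walk(node: ET.Element, section: Optional[str]) -> Iterator[Tuple[str, str]]:
        # Remember the innermost enclosing section so each check can point back to it.
        if node.tag == "section":
            section = node.get("id", section)
        if node.tag == "programlisting" and node.get("role") == "policy-check":
            yield (section or "unknown", (node.text or "").strip())
        for child in node:
            yield from walk(child, section)

    yield from walk(ET.fromstring(docbook), None)


def render(docbook: str) -> str:
    """Join the extracted checks, each preceded by a comment naming its section."""
    return "\n\n".join(
        f"# from policy section {section}\n{code}"
        for section, code in extract_checks(docbook)
    )
```

Running `render` on two versions of the policy source and diffing the results would give checker authors the kind of version-to-version comparison mentioned in the talk. Listings without the assumed `role` attribute are ignored, so ordinary examples in the text would not be mistaken for checks.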
These guys should find this a nice practical use of all the theoretical computer language stuff that they have been dealing with. And I think it would be an interesting research project, really, because the papers that I read about ontologies promise a whole lot, but none of them has any practical applications. At this point I'm kind of done with what my proposal is. I'm waiting for you guys to come up and poke holes and tell me why this won't work. The benefit of some of these things is that if we do rewrite policy, as I'm going to suggest we do in a couple of days, we can go back, have all our flame wars about what the old bits and pieces in policy actually mean, and start out with a clean slate for lenny, where we can actually rely on the technical policy being clear, and we should not need a release-manager-maintained list of release-critical bugs. So one of the things that I've found the most difficult in figuring out how to program checks into Lintian is that there are a lot of things where it's very difficult to check exactly what policy dictates. You can instead check for something that's slightly more general than what policy dictates and tell people that if they fall into one of the exception cases, where the check fires but policy says that what they're doing is fine, then they can use an override. That might potentially be hard to deal with in this framework, where the pseudocode for the check is policy itself. Why do we need to write policy that requires exceptions? Do you think it would be possible to go back, rethink what we want policy to be, and create policy that reflects the checks that we can actually perform? Can you come up with a concrete example you're thinking of? I'm trying to think of some concrete examples. Most of the concrete examples that I can come up with are not directly policy driven.
They're driven by indirect things. For example, the check to see whether or not an executable is either a script or an ELF binary has some intriguing false positives with things like klibc and some of the other bits that are doing strange things with the dynamic loader. It's usually that there are one or two packages or one or two systems that end up being exceptions to different rules in order to work around some problem, or because they're providing the base for something. You know, libc is an exception to several different Lintian checks because it doesn't function like a regular package. It has statically linked binaries, for example. Well, one of the things that we probably will do is, when we extract the pseudocode, we are probably going to grab a chunk of policy itself, which will go in as a comment. And we are unlikely to get to the point of an ontology and automatic generation of checks anytime soon. But the extraction part, proposing at least a partial solution, a partial check with comments on top about what the exceptions might be, would still make it easier for the policy or package checking software authors to... It is out there in policy itself what the exceptions are likely to be. And if we make the comments informative rather than normative, you can just add on any other comments that might be needed as fed back to us by the policy checker writers. I think we need a microphone. Could you come up to the mic, please? Excuse me. Maybe I didn't understand what you said, but I'm not sure that you have defined what is inside the pseudocode language. And in fact, I would like to know if you are thinking of something more formal like, you know, Z notation or the B language, or tools like Coq and everything like that. I ask you that because it's not only pseudocode in this case; it's also, how would you say, a theorem prover.
So it means that at the same time that you produce tests for Lintian using this kind of language, you will also try to deduce whether any pieces of the policy contradict each other. So I'm not sure it can fit your needs, but I just want to know if you have thought of this kind of thing for your pseudocode. I'm perfectly open to whatever tools I can use to get this working. And actually you should ask Russ whether he is amenable to this, because he's one of the people who is involved with Lintian. So the difficulty might lie in the translation. How easy would it be to take those languages and formal theorem-proving tools and convert them into something that can run on a .deb file or something and see if the package actually conforms? The situation that I'm in with this general idea is that, you know, I completely agree that it's theoretically possible. There's nothing that should be preventing us from being able to do this. I personally don't know enough about theorem proving, about the languages that people have developed to do it, and about the research that's been done in that area, which is pretty extensive, to be able to comment in any comprehensive way on what tools and whatnot would be used. I mean, I know enough to know that you can do this kind of thing. I don't know enough to know how. So that's where it would be really nice if someone who has an interest in that particular area was interested in trying to implement something like this. I'd certainly be more than happy to explain the internals of Lintian, the motivation behind the check system, you know, some of the history and the like, and see where we can find common ground. I am in the same boat you are. I have been reading papers about this. I have no hands-on knowledge of how to do this. I am hoping that people who are interested in using these tools will come and help us accomplish that, you know, the wonders of free software.
The other interesting checking problem that Lintian in particular runs into is that Lintian is a static checker which attempts to not assume anything about the trustworthiness of the package that it's checking, which, among other things, means that it can't execute anything that is part of that package. Policy frequently has statements about what the package should do at run time, when it is executed. For example, init script output is dictated in policy. So are some things about init script exit status. There are other examples. So you can describe a check, but if the check involves running the package in order to see if it does the right thing, it can be difficult to actually do that check. Actually, I think we should take this offline. This is an issue about a test and evaluation environment. I think it might be interesting to see if Lintian can run its own little virtual machine and throw it away after it's done. Anybody? Since this is another one of the dreaded last slots before mealtime and I think I'm pretty close to being out of time, I guess this is it, unless somebody has any other questions or comments on policy in general, or how obscure policy is, or what they feel like getting changed in policy. I have a question on the policy process. It used to be much more discussion-centric. We discussed issues on the list, and when we identified some kind of consensus, we accepted it and you simply edited the policy. That's a point of view I think you're defending. But I have the feeling nowadays that we have fewer and fewer discussions like that, and the policy process is kind of broken due to that. Don't you think that you should be more proactive in pushing forward new ideas? If you go back to the last flame war we had about policy and the undelegation and redelegation of the policy group that happened last year, there was a proposal about moving it forward.
Unfortunately, even with the new proposal, not much change is going to happen unless you see motivated people coming in and helping create policy. There is an effort on policy now that we are delegates again and can actually take action on our own. I will be talking more about this on Friday, I believe, about the new policy process. I'm open to any suggestions on how people think that we can improve the process without sacrificing quality, and the offer that I made last year still stands. If enough people feel that the policy process is being hindered by my presence and could improve, I still offer to stand down and let somebody else take the process over. So if you are interested in the policy process, I strongly suggest you come to the workshop that we have on Friday, which lasts for two hours and is meant to hash out the current state of policy and the relationship between policy documents and the process. Being fairly new to the policy process and looking at the existing bug list and the like, my feeling on it was about the same as my feeling on several other packages where I've started coming into an existing maintenance group, which is that a lot of people have not had a lot of time, there's been turnover in who's been actually working on it, and as a result there's a large bug backlog. As with many other packages, I think one of the first places to start is going to be triage. There's a bunch of stuff in there that's been discussed, there's a bunch of stuff in there that hasn't been discussed, there's a bunch of stuff in there that's completely untagged, and there's a bunch of stuff in there that's tagged wrong, by whatever tagging you want to use, the old policy process, the BTS tags, whatever. The first thing to do probably is just to find people who have the time and the interest to hash out a process whereby we can go tag everything, and then just go tag everything and figure out where we actually stand.
There's a ton of open bugs, and it's kind of intimidating to look at that and go, I don't really know what to do with any of these. But if we can triage them, then people can start committing them, and then we have progress, and progress draws more contributors, and you get a positive reinforcing cycle. I think we are kind of out of time, but if people want to continue this discussion, we can huddle around after the... I guess this is it for the time being. We'll talk more on Friday.