So, the next talk is Petter Reinholdtsen talking about upgrading a preconfigured CDD.

You probably know by now that I'm involved in the Debian-Edu project, and we have been struggling a bit with upgradability. I thought I'd just sort of informally talk about... Is the microphone working? I don't know. Yeah, it's working. It has to be very close. Yeah. We aren't able to extend it up a bit. I think we're going to do it like this.

Okay, so we have been struggling with making a custom Debian distribution for some years now and have discovered a few problems. I wanted to share some of these observations with the rest of the project, so hopefully we can improve the situation for Debian-Edu, for all the other custom Debian distributions, and probably for the rest of the Debian user base as well.

Some people in Debian seem to believe that Debian is seamlessly upgradable and works perfectly for all users when they want to upgrade. It's no problem; they can do it for hundreds and thousands of machines with no issues at all. All you need to do at the end of the upgrade is test it properly, and then it is okay. I do not share this view.

I'll go through what I mean by a seamless upgrade. I'll talk about some of the issues I've discovered when upgrading packages with no local configuration, with user configuration, and with server configuration. I'll run through a few proposed solutions and then give a simple example of implementing the solutions for a randomly picked package.

So, seamless upgrades. If I can install a package, add configuration to it, and upgrade it while keeping the configuration without any interactive fuss, then it is a seamless upgrade. The non-interactivity is really important. If you want to upgrade 2,000 machines, you do not want to answer stupid debconf questions about whether you want to keep the old or the new file. You want to keep the old configuration, and the package should be working as it was before the upgrade.
If this is not the case, the upgrade is not seamless.

There are a few problems in Debian at the moment. There used to be more problems, so we are improving. It's not a case of "everything is bad, let's just give up". It's more that we are not doing very well and we should improve, and we are moving in the right direction. It is possible to pull the boat in the right direction.

To give a simple example, this is from an early run of the upgrade testing script I made, I guess, one and a half years ago. It was a very simple script: it made a chroot, installed a package, fixed the sources.list entries and did the upgrade from Woody to Sarge. This is the basic upgrade: you install a package, you don't touch its configuration files, and you upgrade it. This should work out of the box, no questions asked, and everything should be working just like it was before the upgrade.

This was not the case if you installed xdvi. xdvi depended on tetex-base, and tetex-base would change its own configuration files after installation and then get really confused when you tried to upgrade it. If you run into this kind of question for configuration you didn't change, and you have no idea whether you should keep the old or the new file, just imagine. Well, try to explain to your grandmother over the phone how she should fix the configuration to make sure it keeps working. That's a good test for package maintainers and software developers. If you can explain to your grandmother over the phone how to fix a problem, then your software is very good. If you can't, you're creating problems for the support call centers all over the world.

In this case, I think the preinst and postinst scripts of the package had generated this file, and then the maintainers suddenly decided to make it a conffile. The scripts didn't really handle upgrades very well, so we ended up with this stupid, stupid error.
And of course the installation and upgrade were running in non-interactive mode, so it was impossible to get information from standard input, and the installation actually failed.

Then you have the other issue, with user configuration. This is when the package has been installed, the user has run one version and done some configuration of it, and then you upgrade the package and want to keep the user configuration. The user configuration is stored in the user's home directory, so the system administrator should stay away from it. And this doesn't really work very well.

KDE is a random example, because I use KDE a lot and we use it as the default desktop in Debian-Edu. I also run it on my parents' laptop, which I administer remotely over the net. Upgrading from Woody to Sarge on my parents' machine confused them quite a lot. The trusty old logout button, which they had been using all the time to leave the machine alone and turn it off, was gone. When you moved from Woody to Sarge, the logout button and the lock screen button were missing. They didn't appear on the KDE panel anymore. So I had to explain to them where to find logout in the K menu, which confused them a bit because there were some problems with translations as well. Moving buttons around actually confuses non-technical people. It is really painful.

Even worse was that the KDE panel, the list of applications available as buttons at the lower part of the normal KDE installation, broke. The content of this panel was written to a configuration file in the user's home directory, and it was pointing to files somewhere in /usr/share, the desktop files of the Woody installation. These files changed name, were moved to some other location or were just removed, and the panel buttons stopped working. I think that of approximately 10 buttons, only two survived the upgrade.
And then you have the problem with larger installations where you have different versions of KDE, or of almost any application, but let's pick KDE here. Say half of the machines are upgraded to KDE 3 and the rest are running KDE 2, and the users can log into any of the machines, and they will. So they will move from a KDE version 2 installation to version 3, back to version 2 and back to version 3. And KDE gets really confused. It's kind of capable of handling upgrades; sometimes it will convert the configuration to the newer format. But it's very rarely capable of handling downgrades, where it would have to know the future location of a file and fetch configuration from there. This does not work at all.

A third problem I've seen is upgrading server configuration. There, at least, the system administrator can cope with the problems, but when you have a lot of machines you don't really have the time to do it. Squid is my randomly picked example; I will run through a more detailed example later on.

In Debian-Edu we want to set a few values. We want to increase the maximum object size, because we use Squid to proxy and cache Debian packages. We want to refresh the Release and Packages files a bit differently from the rest of the files it is caching. And we want to allow access from the local network. So we want to add six lines to the configuration file, and we would love this to survive package upgrades with no questions asked.

We can't do that, because we have to modify the file in /etc. During upgrades, if the maintainer has modified the default file, dpkg will ask: do you want to keep the old or the new file? We don't really want to keep the old file, and we don't want to use the new file, because we want to keep our old configuration and get all the new defaults. That is not an option when you run dpkg and try to upgrade configuration. I think the situation is like a merge when you have different branches.
In this situation you have a branch with your own configuration and a branch with the new package version. If you can't merge them easily, somebody has to decide which part of the configuration should be included from which branch.

Right. Thomas Lange from the FAI project mentioned that this is a merge problem, where you want to merge configuration from the old and the new file. That's absolutely correct. That's what you want to do. I've actually written a patch for interactive merging. It's been in the BTS for the last two and a half years, and I think it's finally getting applied. But as I will explain later on, I don't think merging into one configuration file is the correct option. I think that's a workaround for a bug in the configuration system of several programs.

So, one proposed solution. How many of you know what a hidden debconf question actually is? Raise your hand. One, two, three, four.

On the screen there is an example of how to implement a hidden debconf question. Some people complain: "I don't want debconf questions. There shouldn't be any more questions asked during installation." That is a valid argument, but it's not applicable to hidden debconf questions. No question is asked of the user during installation. That's why it's hidden. And some people claim: "No, I don't want to make it configurable. One size fits all." Well, that's obviously flawed, because we want another configuration and we do exist. So someone does want another configuration. Please make it possible for us to do it in a sensible way, and in a shared way, because all the custom Debian distributions want to configure several of the same packages. If we can do it one way in one package, all the custom Debian distributions will gain from this.

So, to run through the example: you make a template.
Normally it's a good idea to document that it should not be translated, because the translators are very active in finding text to translate. If you don't tell them to stay away from this template, they will file a bug report asking you to make it translatable. So I'm writing explicitly: do not translate this text. The text is not marked as translatable and is not supposed to be translated at all, but they will read the template, find some text that is not translatable, and complain. So you have to write explicitly that this is not supposed to be translated; it's not enough that it isn't marked as translatable text.

So you have a template. It contains a template name, a type and a default. And you have a config script, and this is kind of the important part. Some people get it wrong. The script needs to check whether there is existing configuration and use that value instead of the debconf value. You cannot trust the debconf values between different runs of the config script, or the maintainer scripts as they are called. So if there is a configuration file, you need to use the value in the configuration file. If there is no configuration file, that's when you use the hidden debconf question and the preseeding.

So, in pseudo shell script: if the config file exists, read the configuration and set the debconf value to the current value from the configuration file. Later on, the postinst script will run, and it does similar things. If the configuration exists, get the current value. Then fetch the current value from the debconf database as well. If those two are not equal, you are supposed to change the configuration file, so you update the configuration file and you're done.

This makes it possible for me and all the other custom Debian distributors to enable this option, which is completely hidden from the normal installation process, in a way that lets us get the package configured at the time we want it.
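The hidden-question pattern described here can be sketched roughly as follows. This is a minimal illustration and not the code from the slide: the package name foo, the question foo/proxy, the file /etc/foo.conf and its proxy= line are all made up for the example, and db_get/db_set below are small stand-ins that keep answers in a plain file. A real maintainer script would get the real db_get and db_set by sourcing /usr/share/debconf/confmodule.

```shell
#!/bin/sh
# Hypothetical template for the hidden question (debian/foo.templates):
#   Template: foo/proxy
#   Type: string
#   Default:
#   Description: for internal use; can be preseeded
#    This question is never shown. Do not translate.

CONF=${CONF:-/etc/foo.conf}          # made-up config file with a proxy= line
DB=${DB:-/tmp/foo-debconf-standin}   # stand-in for the debconf database

# Stand-ins for debconf's db_set and db_get, so the sketch runs anywhere.
db_set() {
    { grep -v "^$1 " "$DB" 2>/dev/null; printf '%s %s\n' "$1" "$2"; } > "$DB.new"
    mv "$DB.new" "$DB"
}
db_get() {
    # Like debconf, leave the answer in RET.
    RET=$(sed -n "s|^$1 ||p" "$DB" 2>/dev/null) || RET=""
}

current_value() {
    # Print the value currently in the config file, if there is one.
    if [ -f "$CONF" ]; then sed -n 's/^proxy=//p' "$CONF"; fi
}

config_step() {
    # config script: an existing file wins over whatever debconf remembers.
    if [ -f "$CONF" ]; then
        db_set foo/proxy "$(current_value)"
    fi
    # No db_input here: the question is never asked, only preseeded.
}

postinst_step() {
    # postinst: rewrite the file only if it differs from the debconf answer.
    db_get foo/proxy
    if [ "$RET" != "$(current_value)" ]; then
        { grep -v '^proxy=' "$CONF" 2>/dev/null; printf 'proxy=%s\n' "$RET"; } > "$CONF.new"
        mv "$CONF.new" "$CONF"
    fi
}
```

On a fresh install the preseeded answer ends up in the file; on later runs the value already in the file is copied back into debconf first, so a local edit is not overwritten. A custom distribution would preseed foo/proxy (for example with debconf-set-selections) before the package is installed.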
Of course, this does not solve the upgrade problem; it solves the installation problem. But if you make sure the postinst script actually tracks and handles the configuration you set during upgrades, you have solved the upgrade problem as well.

So, on to the second proposed solution. As I said, the goal is to keep local configuration during upgrades. For this to work properly, it's very important not to change the configuration file format between versions. If you do, you need to convert the old format to the new format, and you will discover a lot of interesting problems in the process. So it's a lot easier if you don't change the format of the configuration file.

The easiest way to make this happen is to make sure that the site or host configuration is separate from the package default configuration. You also want more than two locations for the configuration, because there are different groups and people that want to have a say in the configuration of a package. At the university, we have the package author, who gets to have a say in the default configuration. Then you have the university's central UNIX administrators group, who want to provide a good default configuration for the university. Then you have the local system administrators at the different departments, who want to do some minor overrides as well. And if you're really lucky, you have the machine owner, who wants to have a say in the configuration of his machine too.

So if you read configuration from at least four different files and merge the results together, you will make all of us happy without us having to step on each other's toes to get the configuration we want.

This is just one proposed example. Read configuration from /usr/share/foo/config, which is included in the package.
Then you have some site file, which is read as well. I'm proposing /site/share/foo/config; this is where the university-wide group would put their configuration. Then you have /etc/foo/config, which is where the host administrator would put his configuration. If it's a user application, you probably want user-specific configuration as well, which would be in .foo/config in the user's home directory.

For some applications you also want to provide fixed overrides. Say the university administrators want to make sure that all the users of a given program are using the web proxy: it's a browser, and you want them to use the proxy. You specify this in /site/share/foo/config.fixed, which would be the location for the university-wide fixed configuration. Or maybe the machine owner wants them to use his proxy, which will talk to the university proxy, so he would put an override in /etc/foo/config.fixed. And then of course, in the very, very rare case where you want a fixed configuration at the package-wide level (I can't imagine when that would be useful), you would put something in /usr/share/foo/config.fixed.

This makes sure all the groups that want to have a say in the configuration have their own file. They can put the overrides they want in their own file and not run into conflicts with the other groups' files.

To apply this to Squid: the Squid program could read the default configuration from, for example, /usr/share/doc/squid.conf; that's just a random example path from the package. That would be the default configuration. Then it would look for a /site/etc/squid.conf file and read overrides from there, and finally read /etc/squid.conf to get the host-specific configuration. The postinst script could, for example, ask for some commonly useful values, like the maximum object size or the subnet that is allowed to connect to the Squid server, and then add those to /etc/squid.conf.
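A minimal sketch of this lookup order for a hypothetical program foo with simple key=value files might look like the following. The paths follow the layout proposed in the talk; the ROOT prefix, the function names and the key=value format are assumptions made for the example. The talk does not say which fixed file should win when several set the same key, so the sketch assumes the same package, site, host order as for the ordinary files.

```shell
#!/bin/sh
# Hypothetical program "foo" with key=value config files.
ROOT=${ROOT:-}    # prefix for trying this out in a scratch directory

lookup() {
    # lookup FILE KEY: print the last value of KEY in FILE, if any.
    if [ -f "$1" ]; then sed -n "s/^$2=//p" "$1" | tail -n 1; fi
}

get_setting() {
    key=$1
    value=
    # Ordinary files first, each level overriding the previous one;
    # then the .fixed files, which therefore beat even the user's file.
    for f in \
        "$ROOT/usr/share/foo/config" \
        "$ROOT/site/share/foo/config" \
        "$ROOT/etc/foo/config" \
        "$HOME/.foo/config" \
        "$ROOT/usr/share/foo/config.fixed" \
        "$ROOT/site/share/foo/config.fixed" \
        "$ROOT/etc/foo/config.fixed"
    do
        v=$(lookup "$f" "$key")
        if [ -n "$v" ]; then value=$v; fi
    done
    printf '%s\n' "$value"
}
```

With this layout each group edits only its own file, and a package upgrade only ever replaces the file under /usr/share, so no conffile question has to be asked. One limitation of this particular sketch is that an empty value cannot override a non-empty one.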
And it would make sure those values survive upgrades. For those of us that need to provide a more complex configuration, we could add a file from our own packages as /site/etc/squid.conf or in some similar location.

Of course, in Squid there is a painful limitation in the file format: there is no level of indirection when it's talking about subnets. For example, we want to make a Squid configuration that works across any installation, where you have a list of IP subnet mappings somewhere and only provide a symbolic name in the squid.conf file. Then, when the subnet changes for a new installation, you only have to update one or two files instead of, as it is at the moment, updating ten files to make everything correct. But that's a minor issue.

So, that was my talk. I would like to have a discussion on how we can convince upstream to realize this is a good idea, and hopefully make sure all the Debian packages are upgradable in a proper fashion without requiring manual work to keep things working. Any questions?

Do you think that there are problems with config files where the ordering of the lines is important?

I don't think I understand the question.

For example, if you have an httpd config, the ordering of the lines in the config file is very important. In a .d directory you can order the config files however you want, but with certain config files, how do you keep track of the syntactically correct ordering of things?

Yeah, the Apache config file is a really good example of a configuration file that is not easily upgradable. I don't have any good proposals to fix that, because the ordering is so fixed, and if it is to read several files you can't just include them at any location. So the Apache configuration file is a bad configuration file for multi-level configuration.

The Apache 2 configuration is much more split up into tiny little pieces.
Is that better for your plan, or is it the same as before?

You mean the Debian one?

Right.

The .d directory is a step in the right direction, and if you are lucky with the format, you can do complete multi-level configuration of the box. But if the format depends heavily on ordering, and if some values can't be overridden once they are set while others are overridden if they appear later in the file... I think that's the case with Apache: sometimes the first value takes effect and sometimes the later value takes effect, depending on the setting. It's a very good example of how it should not be done.

I believe that Apache is a good example of where you have to go to the maintainer, figure out what should be configurable and what should not, and then push him to make those things configurable that you need for the CDD, instead of trying to make upstream design it a better way and convincing Apache to use another configuration structure. We are talking about multi-level configuration, and I believe this is a situation where, instead of trying to change Apache's way of doing it, you should use the tools provided in the Debian package.

Yeah, that's... Should I repeat the question? No? It wasn't that much of a question. He made the observation that in the Apache case it's easier to provide overrides in Debian than to try to convince the Apache group to do it a better way. I tend to agree that, given the current state of affairs, it's the best of all the possible options to actually get something done. And of course at the university we have our own configuration file, which includes several files at the correct locations in the Apache configuration file, so you can actually provide hooks that way. And it... Well, we hope it will work across upgrades. We'll see. Yes?
The problems here are so complex and wide that I think the only way to get to the point where all of the upstream packages work that way is to have a library or two that implement these features. They should have configuration file parsers, of course, and support for multi-level configuration files. I don't think it's even possible to convince most of the upstream maintainers to change their code to have even two levels of configuration file overrides. It should be some library. Does anyone know of these kinds of libraries?

I agree that it would be a lot more convenient for developers if they could offload that problem to some library and get it right without having to figure out the solutions on their own every time. On the other hand, there are existing projects that do this properly. KDE is a very good example: you provide an environment variable with a list of directories it's supposed to read configuration from. In that case we, in Debian-Edu, make a sub-directory and point KDE to it, and it works flawlessly. So it's possible; it's not rocket science. It's been well known for probably 20 years how to do this properly. But a lot of developers do not know how to do it, and they don't really realize there is a problem, I think, and they would really like a better way to make it work with their package.

On the other hand, I don't think we can hope to convince everyone to use one true file format, and just getting upstream maintainers to modify their package in a way they don't understand will take some time. On the other hand, I have had very good experiences talking to upstream developers, explaining the problem and why I want multi-level configuration, and getting them to change the package in a way that keeps it backward compatible with the current installation base and also works for multi-level configuration. So it's not that hard. And it's not that many packages either.
We have approximately 1,000 packages in Debian-Edu, and the files we actually need to configure number fewer than 20. So it's not a really horrible problem. Most packages, most programs, do not need configuration. That's the lucky situation. There are very few packages we need to fix. But some of them are really hard.

I think it would be very good if there were some tools or a little framework for supporting the config.fixed file, or several locations. For example, for a simple configuration file that has shell syntax, a framework or a tool where you can say: here is a list of directories, please look for the configuration files there, and we would also like to support the .fixed configuration. That would be very nice. Even in the FAI project we currently have the problem that people want to use different configuration directories or configuration files, and now we have to add this feature to it.

Let me show you. This is a shell implementation of multi-level configuration in popularity-contest. This is the multi-level configuration part.

But where is the fixed file?

Well, if I wanted a fixed setting, we could add another line. It doesn't make sense in this case. You would only have to add it here. Voila! It's not very complex. It's not hard. But you have to realize there is a problem and know that this is how to fix it.

Except that this is a bash script, or actually a shell script.

It's a shell script, yes.

So it's easier with this one than with a C program. And I guess that the fact that KDE programs work well is just because they have libraries which are very coherent.

I agree, yes. We should have a C library for C programs, a Perl library for Perl programs and, yes, a Python library for Python. Well, for all languages, we should have some recommended way to handle configuration files and make it well known to all the developers of free software. And this is just to show that it's not very hard to do using a script language with includes.
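The script shown on screen is not part of the transcript. As a rough idea of the sourcing approach being discussed, here is a minimal sketch for a made-up package foo whose configuration files are themselves shell syntax. The paths, the ROOT prefix and the variable names MAILTO and DAY are assumptions for illustration only, not the popularity-contest code.

```shell
#!/bin/sh
# Hypothetical package "foo"; paths and variable names are made up.
ROOT=${ROOT:-}    # prefix for trying this out in a scratch directory

load_config() {
    # Built-in defaults, used when no file says otherwise.
    MAILTO=root
    DAY=sunday
    # Package defaults first, then the host's file; later files win.
    # A fixed file would be just one more entry at the end of this list.
    for f in \
        "$ROOT/usr/share/foo/default.conf" \
        "$ROOT/etc/foo.conf"
    do
        if [ -r "$f" ]; then . "$f"; fi
    done
}
```

Because each file is simply sourced in turn, a later file only needs to mention the variables it wants to change, and the package can update its own defaults file freely during upgrades.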
What about configuration where the number of lines varies? For example, you have the ACL lines in your Squid configuration. Can this be handled with preseeding? Sometimes you only want to add two ACL lines, and sometimes you want to add 20 lines of ACL definitions. Can that be easily done with preseeding?

Well, I don't know about the "easily" part, but I'm pretty sure it can be done with preseeding, yes.

Because as far as I understand, debconf is always one question, one answer.

No, it's not. You can have several lines in the answer to one question. You can ask it, and you can preseed it too. But it's sort of ugly.

Yeah. I suspect that in the complex case it might be easier to have a debconf answer provide the location of another configuration file to read. Then we could just point Squid to our file and be done with it. But there are several ways to do it; some of them are easier for simple things and some of them are easier for more complex things.

Are you not using cfengine for these cases?

We do use cfengine for these cases, but that normally breaks upgradability, because the packages do not really keep the configuration. If you rewrite the config file, you end up with a conffile question during the upgrade. And that's the one I want to avoid, because when you maintain 1,800 Unix machines, as we do, you don't want to answer those questions.

But I guess we have the same problem with FAI and massive upgrades, as I asked in your session. Even though we have great systems like FAI to do mass deployment, we still have to deal with this upgrade problem, and at the moment, even with FAI, it seems to me that in the short term we are unable to solve it.
Yeah, well, there is a workaround being worked on in Spain. One of the custom Debian distribution developers there is working on a kind of elegant hack: just after the package is installed, he makes a copy of the original configuration file and then stores his own file over it. Then, just before upgrades, he takes away his own file and puts it somewhere else, puts back the original file, upgrades the package, and then switches back. So it kind of works, but it's not the most elegant solution I've seen to that problem. You have to keep track of the format to make sure it doesn't change, you will lose the new default values, and all those kinds of problems. It's a workaround that will solve some of the problems, with a lot of problems of its own.

The workaround that we use as FAI users is wipe and reinstall for upgrades.

Which is a commonly used solution in the field of computer science, but I would rather have an easier way for my grandmother to upgrade her server than to reinstall.

I wonder, since you mentioned what Sergio is doing in Spain: what is the better solution for the problem? You told us about the problem, but we have no real solution for it, if I understand your talk. Unless we get all of upstream to do the right thing, we don't have any real solution.

Yeah. I think that if we want to change the way configuration files of programs in Debian behave, we want upstream to be in on it. One thing is fixing it in Debian; another thing is making sure we don't make the documentation and the books explaining how to use the software in the free world invalid. If you buy a book on Apache, you expect Apache configuration to work a certain way, and if it doesn't in Debian, a lot of people will be very surprised.

So the way is that first we achieve world domination with Debian and make upstream use Debian, and then they will care.
No, I don't think that's required. As I said, my experience with upstream is that they are actually very skilled; they understand the problem when you explain it to them, and they are willing to find a solution. I think for four or five packages I've been able to get upstream to fix the behavior of the program. But we have more than 10,000 applications and we have 1,000 developers, so if everyone knows about the problem and talks about it with upstream, we will have a better chance of fixing it.

But is debconf being used consistently as well, or is that pretty much solved now?

What do you mean, consistently? There seems to be an increasing number of developers in Debian who realize that debconf is the best way to handle questions at installation time and to make it possible to preseed them. But there are of course some misconceptions about what debconf actually is and how it works. Take the argument that you shouldn't have debconf questions because the user shouldn't be confused during installation: that's just a misunderstanding, and I hope to get rid of it sometime in the future, but there are still people that believe it. There are also people that believe that the default is perfect for everyone, so you shouldn't be able to provide the NTP server at install time; you should use the pool.ntp.org DNS name instead. It will take a while before we can convince them otherwise.
Would it be a good idea to implement something which makes it easier to prepare hidden debconf questions, some kind of template, to make developers aware that there is something like this? It's not really hard, but if there is something technical, a command-line switch or whatever, to enable it, that makes it easier to...

Well, it's not that easy, because it depends on the configuration part. As you can see here, I've hidden the problem of reading the configuration and modifying the configuration.

But the template is always the same, apart from the name of the question.

Well, the name, the type and the default, actually. The description can be the same, though it should have some information relevant to the option that is being enabled.

It's just a political thing. I don't think that this is hard; it's just to make people aware that there is something like this and that they should use it.

Yeah. We can provide examples and documentation and maybe a framework, but I'm not sure we can provide easily enabled hidden debconf questions, because it depends on the configuration part of the package. If the program doesn't have multi-level configuration, they actually have to modify the file that was shipped in the package and put in the right value, and then it gets really ugly during the upgrade.

I'm always thinking that if you make it technically very easy, people will use it. If not, they will ignore it, because they are not attending this kind of presentation.

Yeah, that is a good point. I just don't know how we could make it technically easy, or technically easier, to fix. Do you have a question or comment?
Alright Yes I think that question is a good solution for simple applications and desktop applications mostly but for complex ones like Apache it's pretty much hopeless to solve the problem we care about you can't make sensible get-on dialogues to use it to specify which virtual holds or that kind of thing it wants to use it doesn't not only solve the problem I tend to disagree I think it's possible to provide that con- proceeding for Apache that will make it configure the way you want it with for example which modules do you want to enable or what virtual holds name do you want to provide and make sure that it's going in some given include file the Apache configuration will read appropriate location in the configuration I'm pretty sure that's possible to do and it will be useful to do too I sometimes wish that question would you like to be user director please accessible and would you like to have the doctor access such basic things that you always have to change should be in a low priority that question I was just going to comment that kind of thing is probably possible but whether it's wise to add all those questions that is another thing that kind of questions are something that will confuse most users which brings me back to the misunderstanding mentioned earlier you don't have to show these questions to the user you only provide them to this custom Debbie on these folks so your objection is not valid you don't have to present the question to the user it might be completely invisible for all installations do you think that custom Debbie on installations would be satisfied by the level of connectivity you would achieve in the conflict I think Apache is something that you have to tweak every time I'm not so sure it would be quite enough based on my experience with the university you don't have to tweak Apache configuration the university provides a default configuration that works for 90% of all the servers, I think they have like 100 web servers at the 
university and the 10% will have some extras and just like 1% will replace the configuration because they have so complex needs but most of the custom Debbie on these folks will be happy with this small set of configuration options and the rest maybe is happy to take the path to replace configuration file to be used instead of the default one so I'm pretty sure it's possible to satisfy all the custom Debbie on these folks installation wise installation time configuration yes I think there was a question somewhere comment yeah I would have a question you are saying that the Debbie developer should add the option for custom university distributions to configure the package for finding the problem without seeing it from the side of the Debbie distribution I have no idea what options I really need quite a complex package and it's a lot of work to support the option we have so if you need more options I would expect that the custom distribution context the maintainer says I would need that option and potentially implement it and perhaps even as attached then it would be accepted and so yes actually once in school the Linux but I expect because it was quite a complex package and I don't feel as supporting it without understanding it first and that's actually the problem it's a good point it would be great if all the Debbie developers could read lines and would hand pick the important ones from all the school Linux developers but I think we need to see the information the other way around we have to talk to the developers and let them know what we need also we need to talk to each other all the custom distribution needs to coordinate and see if they have common needs or if there are some difference based on my experience the needs are pretty identical but I'm pretty sure there will be some differences between them and yes we should do a better job of talking with the maintainer and with upstream start a dialogue with each and every and package maintainer we need to 
configure. But on the other hand, we don't have that many people capable of doing that. So we are not there yet, and we know it.

But you were talking about 20 packages in this example. So if you try to talk to 20 maintainers, I don't think that would be a great problem.

Well, you would be surprised. Sometimes we have been successful talking to the maintainer of a package. It got the configurability we needed to configure the package, and then a new maintainer took over and ripped the whole thing out. So it's kind of painful. And normally, well, we are all kind of busy doing everything, so we spend 5 minutes now and 5 minutes then, and you need to talk with a maintainer for probably like 6 months to a year before they get it. Of course you don't get to sit down with them and explain it to them. You get to send an email and then wait a while, and you get an email back, and you wait a while, and you send an email, and you wait a while, and it takes forever.

Well, the solution is that we invite everybody who delivers packages used by Skolelinux to a party in the new year in Germany. So I want to see you there. We want you.

That's an idea that came up during DebConf now: that we should actually send you an email telling you that we need you, love you, because we use your package, and that you are really welcome to come to our anniversary party or release party or whatever, with free beer, and then force you to listen to a talk about preseeding. [unclear remark about tickets] But we will try to make that happen.

That's a good idea. I will do it. I have a database.

My question is: if you go over to group maintenance, do you think it would be possible, from the manpower point of view, that every interesting package for Skolelinux is group-maintained with a Skolelinux developer in the group, so that you have someone close to the package?

That would be great, but I
don't know if we have the manpower to do that. We try.

This is the last question.

It's not like you need to have all packages maintained by Skolelinux people. It's something like that.

I interpret that to mean that for every package which is group-maintained, one of the members of the group... That's not what you meant?

Yeah, that's what I mean.

I'm not sure it would scale, but it's a good idea.

It will be a great party with all these people. There is room enough.

So I think we have to stop. Yeah, thank you.