Good afternoon. Today we have Andreas Tille talking about healthy custom Debian distributions. When you are going to ask a question, please wait until the microphone reaches you, look at the camera and say your name. Thank you.

Thank you. Hello. I hope I'm not boring you with these two talks in one day, and I hope to make this one quite different from what I said before. And apropos healthy: I do not know who joined the wine and cheese party yesterday, but I realized I was not able to wake up on time. Normally I wake up two hours earlier, and I want to know which wine was responsible for this. So I would like to do a test: put the wine from Argentina, Brazil, whatever in one slot and the other wines in the other, and then test those wines. If it works, then by splitting them up further into groups we finally find out which wine was responsible for sleeping so long.

But now for the real topic. I would like to say something about the history of Debian-Med very briefly. My concern is to show you what might be a good idea when you build a CDD, that is, a custom distribution which works in a specific field. We have had several attempts, most of which failed, and they show which things you should do and which things you should avoid. And I think I found a way to measure which CDD is really healthy and which is not working. That is part three of this talk.

So what about the history of Debian-Med? It started at DebConf 1 in Bordeaux. It was also known as the Libre Software Meeting. This was a kind of cool conference. The first two Debian conferences happened in Bordeaux; they were called DebConf 0 and DebConf 1. And it just happened that we had a formal dinner, as we can experience tomorrow at this conference. And on this day the following happened.
There was a medical track, which was the reason why I was sponsored to go to this conference, and there was a guy sitting there who was constantly busy translating installation instructions for some medical program. I thought: well, this guy is busy translating installation instructions, but should the installation instructions be translated, or should we rather have an installer for this program? I mean, build a software package and say apt-get install, whatever. So we had some wine, and then we started thinking about it, and so we stopped drinking wine, opened a laptop and started preparing a talk about how to do things right, instead of just translating some texts nobody really wants to read. Because who really wants to read installation instructions? You want to run a program and not be busy with installation. So in the course of this formal dinner, with some wine, I prepared the first talk about the not yet existing project which was later called Debian-Med, and that means we began from zero. This is an optimal situation: you cannot fail if you are growing from zero, something is happening.

This was quite easy, and the idea of the custom Debian distribution was actually born in the year 2000, when these were rather called Debian internal projects. The idea was from Ben Armstrong, and he thought: well, my children are small, and I want to make a system which they want to use. Not just "father, can you help me"; they should be happy about the system and keen on using it. So the idea was that we assemble a set of nice games for children and find a way to install them easily. I adopted this idea for Debian-Med, and, as Holger told us, there were Debian Edu and Skolelinux growing up in parallel. The Debian Edu project somehow died and was not maintained anymore, and so my suggestion to the Skolelinux people was: take over Debian Edu, you are now Debian Edu. And so these somehow became one.
And further, at DebConf 3 in Oslo, there was a decision that these Debian internal projects should somehow be differentiated from other internal projects like the installer or hardware support, to distinguish between technical projects and user-oriented projects. This was the birth of the name custom Debian distribution, and I admit I'm absolutely not happy with this name because it is so confusing. As I mentioned in the talk before, if you hear custom Debian distribution you think: well, custom, it is something else than Debian; somebody takes Debian, does some stuff and releases it. And this is actually not what we want. We want to make a system that is integrated into Debian as much as possible, to keep the maintenance effort as low as possible. If you are deriving from Debian you are constantly busy changing the installer or whatever. We want to have one handle to switch, and to make the installation medium for a custom Debian distribution very similar to everything else which is plain Debian.

So we have other projects, like Debian Accessibility. It is oriented to blind or visually impaired people. These people do not actually call themselves a custom Debian distribution, but in principle they are doing what we are doing as well. I will come back later to why I list this project here.

Quite similar is Debian Desktop. It makes no sense to talk so much about Debian Desktop. On the one hand it is not a real custom Debian distribution, in the same sense as Debian Accessibility, because they have not adopted the techniques or whatever, and on the other hand this project is kind of silent. There is not so much outcome.

Then Debian Lex, which was intended to make a system for lawyers. I think the idea is really good, but the project was more or less run by one person, and this person somehow left Debian, so there is not much outcome anymore, or more or less no outcome.
The Debian non-profit project tried to build a system for non-profit organizations. I think this is also a really good goal. You could do several things to manage agencies, or maybe conference management projects or something like this. Unfortunately this project more or less died. The project's mailing list is even closed now.

There is also Debian GIS. Debian GIS is a very interesting project. They try to assemble all these geographical information systems. They have not yet adopted our tools, but they are currently busy doing so. This can be called a custom Debian distribution insofar as everything is also inside Debian. It is quite similar with Debichem. It is for chemistry, for packaging chemical projects. The problem with both Debian GIS and Debichem is that they concentrate mostly on just getting projects packaged. They are more or less packaging teams. In my opinion building a custom Debian distribution is more than just building packages. It is about contacting the users and building a real framework to make something people really want to have. It is nice to have some ready-to-install software, but a distribution is more than assembling and collecting software and putting even more into the archive. We want to do a little bit more.

Finally, Debian Enterprise. It is a cool idea, but I would call this project a failure, and you will see why it did not work. I will elaborate more on this.

I thought about how I can measure the health of a CDD. I have to do some statistics, get some numbers. We could think about the number of users of the CDD. With free software this is a problem: you don't sell anything, so you have no real numbers on who uses your stuff. What do we have? We have the popularity contest. I guess everybody knows it: if you install Debian from CD you will be asked whether you are happy with reporting which packages are installed.
The popularity contest is a nice thing for obtaining numbers for the whole archive on which packages are used most frequently. But here we have a problem. By definition we have special users, and the special users need special software, and so the popcon numbers will always be low for the important packages. So the popularity contest number is not really a good measure for finding out how well a CDD is working.

We could try to apply this measure to the metapackages: which metapackages are used and which are not. The problem with the metapackages is this. Popcon counts the number of installations, and it also checks whether the files inside the package are really used. When you install Debian you can install a lot of packages, but in the end you use only 50 or 100. The number of installations of a metapackage can be counted, but the metapackage does not really contain files which are used. It is just used to get other packages pulled in via the dependencies, and so you have no real measure of whether this metapackage is really used or just installed. So we have no real handle to count the number of users.

What about the number of developers? Well, you could look at VCS commits, but not all the CDDs really use a common VCS, and so it is a little bit hard to make an equivalent measurement between the projects I'm talking about. And I thought the activity is somehow reflected in the mailing list. The first number you think of is the number of mailing list subscribers, but this doesn't work either, because we have different projects.
I have just got the mailing list numbers for four of these projects. You see that at the beginning of a project the number of subscribers grows, and then you get to a kind of plateau. The problem is that, regardless of whether the mailing list has any postings or not, the subscribers seem to remain. If there are no postings they might not even know that they are subscribed to this list, or it ends up in the spam folder or whatever. So it is hard to rely on the number of subscribers; this number doesn't work.

It is a little bit better with the number of postings. You see that for Debian-Med the number of postings is continuously increasing. With Debian Edu you have this effect: there was not much in Debian Edu until 2003, and then Skolelinux joined, and this made a huge amount of additional mails, and now this project is working really well. There is a different effect in Debian Junior. This was the first CDD, and you see that the number of postings tends to decrease. You have some peaks, and I will talk about these peaks, but it is hard to say something about this graph because there is so much noise in it.
These graphs also include spam mails and automatically generated mails, so this also doesn't really help. And if you look at Debian Lex, this peak, I know, was a thread about whether we should revitalize this list. So this makes one peak, and then nothing happens, but you can't obtain this fact from the graph. So the numbers you have in mind at first from the mailing list, the number of subscribers and the number of postings, do not really work as a measure of what is really going on in a project, because you have noise in the signal and several artifacts. And it is not only spam that is noise. As I said, I would also count commit mails from the VCS, which are sometimes posted to the list. They are also not really relevant for finding out whether this CDD is really working as a community, as a project, or just as a packaging group. You also have some mails like "unsubscription did not work, could you please unsubscribe me" and things like this.

I mentioned this peak on the Debian Junior list in October 2003. It was only one thread. It was about whether we should replace AbiWord with OpenOffice, and then there was the philosophy of text-processing programs and whatever. So we had 79 postings about a very specific topic out of 89 non-spam messages. This was a kind of flash in this list, but the tendency is unfortunately clearly decreasing.

Then you have these kinds of robots. For instance, in December 2009 there were 92 messages to the Debian Desktop list which were some kind of subscription and unsubscription to some funny Gentoo list. I don't understand the sense of it, and it is nothing against Gentoo, but I was just looking at the mailing list archive and thought: oh, what's that, so many mails? But it was nothing, just automatic mails. So these are just artifacts. It just does not work as a way to measure the health of a CDD.

So a different measure would be this: we have these metapackages, and what is the number
of dependencies? If the dependencies are growing, people are working on this and the project seems to grow healthily. This is most probably quite a good measure. The problem is that it is hard to obtain, because up to, I think, the beginning of this year only two of these CDDs were using this technique with the metapackages. So you have no measure for the others, because they just did not use this metapackage technique, and they also did not necessarily use a VCS where you could easily obtain the history of these metapackages. I did some querying of snapshot.debian.net, because I did not even have the numbers for the project I'm running myself, Debian-Med, but I was able to obtain some statistics I was quite happy about.

Debian-Med is divided into several parts, and one is a biology, microbiology part. You must know, it sounds strange: med, I go to the doctor, so what about biology? But I'm working in a medical institute which cares about finding out which virus has which influence and so on, and they have to do some genetic research, and there is some comparison of the genetic code of bacteria and so on. So we have some very specific microbiology software, and there is a lot of this software. We are only scratching the tip of the iceberg. If you wanted to have all of this software we would get to numbers of 500 or so, but it makes no sense to package everything. We try to package the most popular ones, which are used by many people. And so you see a large increase here.

Even at the beginning of the Debian-Med project we had some five or six microbiology packages. The reason was simple: we had one Debian developer who was working at the French Pasteur Institute. It was Stéphane Bortzmeyer, or however it is pronounced, and he put into Debian some random microbiology packages which he was using. Then he left Debian and they were all orphaned, and so I just took these over and tried to bring them into good shape and
started with the first metapackages. We also had something for medical imaging. So in the beginning there was, well, not nothing; there were some packages, but not in a really structured shape. And you see a constant increase in the number of packages, which means the interest in the Debian-Med project was constantly growing.

And since 2005 we even have a practice management program. We have one practice management program out of 20 existing ones, but all these 20 are competing, with different programming languages and different databases they want to use, and most are done by one developer who has a strong opinion about his program and is not compatible with another developer who has another strong opinion. So the situation for free software in medicine, especially in medical practice, is quite hard. It is a real problem to find a straight line there. My intention is also to promote the projects which I think are serious and which are developing quite straightforwardly, and that is why I try to package exactly these, to make them a little bit more popular, and I hope that this number will also increase over time. This is only a selection. We have some more metapackages, but these are the most interesting ones, and I wanted to keep the graph from getting crowded.

So, as I said, in principle the number of dependencies of a metapackage could be useful, but we do not really have this number.

So what about the activity of members? As I told you, the number of subscribers and the number of postings are not very informative, and here is a very drastic example, the Debian Enterprise project. You see you have a maximum of six postings in one month but 140 subscribers. So this is a really strange one. But when I saw it, I wanted to know who is actually posting to this list. Are there really persons? And so I got the idea of what I called a communication measure, or activity measure, whatever, and I wanted to know who is
actually the most active poster on a list. Is it changing over time? Is it always the same? Every CDD has a mailing list, and we wanted to get rid of these robots. If I look at the people who are actually posting, I get rid of the spam, because spam is not sent by the same person every time and so it drops out. The effect of all the flame wars is also reduced, because you are looking at specific persons. So I tried to fetch the top 10 posters of every mailing list. That means I was browsing the index of the mailing list archives over the years and counted who the most frequent posters are, and these top 10 are displayed over all the years, as you will see. And I would especially like to draw your attention to the so-called run-over-by-bus factor.

What do I mean by the run-over-by-bus factor? This is the example of Debian-Med to explain the graph. Here, Andreas Tille, that's me. I was starting the project, and starting a project makes some noise. This is a normal thing; there are many postings. If we forget the first year, the amount of my postings is increasing, except for 2006, when I had some private things to do and was not so active. It is quite interesting that this is reflected here. Until 2005, if I had stopped posting, the list would have died. I'm really sure, because there was nobody else who was that active. But the really interesting thing is that in 2006 a second one came, Charles Plessy. This was actually when my activity was decreasing, and he then became a second very active member who is completely able to take over what I'm doing. I'm really, really happy. He takes care of so many interesting things, organizing stuff and doing release work and all these things. So this is what I would call a healthy CDD with respect to the run-over-by-bus factor: if I were hit by a bus and could not work for Debian-Med anymore, this project would not die, because somebody else is there and
will take over. If you look at the third, this is David Paleino. He is a very active person who joined the project in 2007. He does not do so much coordination, but he has done great work and has written all this web stuff I mentioned in my talk this morning. So we have three very active developers. The fourth is Karsten Hilbert. He is actually a user, and it is also interesting that a user has been here from the beginning. He is a user in this respect: he is the developer of this practice management project, so he is an upstream developer, but he uses Debian, and insofar he is a user. He acts in the sense of "I would like to get my program packaged by you, and I hope you can help me", and he has some inspiring ideas on how we can make our project better. As for the others, Nelson de Oliveira is a Brazilian guy. I had hoped he would be here in Mar del Plata, but he isn't. He is also a Debian developer. And then Steffen Möller.

And we also have some competence in this project. I myself am a physicist, so in principle I'm not really competent in medical matters. But Charles Plessy is a biologist, David Paleino is studying odontology, Karsten Hilbert is this doctor who programs, and Nelson de Oliveira, like Charles Plessy and also Steffen Möller, is very good in microbiology. Then we have Michael Hanke, who is a developer of medical software. So we have some competence, and you have seen on the graph that medical imaging is growing. If you look at the people who are working on this, you see that the development is quite interesting.

What I also want you to notice is that we do not lose people. At first the activity was quite sparse, but in the end people keep up their activity. This is something I would like to point out, because in other projects people were active and then dropped their activity, and this doesn't happen here. So I would call this healthy.

What about Debian Edu? I told you that in the first two years not so much happened, because this Debian Edu
project needed some, well, external blood or whatever. In 2003 there was the conference in Oslo, and then the Skolelinux people joined this project, and it started to become what I would call healthy. The main activity is by Petter Reinholdtsen. He is a technical person, a Debian developer, and he makes the strategic decisions and so on. The second most active poster is also interesting: he is a teacher, and so more or less a user. This high user interest is also a very important thing, and it is very good that this person is constantly posting. Holger Levsen should be known to you from the talk before. You see exactly when he joined the project: his activity started a little bit in 2005 and then he became very active. So you have two main developers there, and so the run-over-by-bus factor is also greater than one here. You can also see Jonas Smedegaard, who is also quite active, and Knut Yrvin, who is a very good politician. He is convincing the Norwegian government that Skolelinux is actually exactly what they want to have, and so he is really great. So the types of the main active people are different, and they work very well together. This is also a very good situation. If you see my own name there, it is not because I'm so deeply involved in the project, but I try to find connections between the healthy CDDs, and so I'm reading this list and posting about the general things which are of interest. So I would not really call myself a strong member of this team, but the six people before me are a really strong team.

So what about Debian Junior? I told you that the initial idea was from Ben Armstrong. You see again that if you start a CDD you have a lot of postings: I have this idea, what do you think about it, should I do this and that? I was quite happy with this project and was responding to this idea from the beginning. The project was quite active until 2002, and then you see a decrease in the postings of
the main poster. This decrease continued, with the exception of 2006, when he said: well, please, guys, let's revitalize this project. It did not really help so much. You also see that the project was losing people. Here there were some active people, and they are not there in the end. So this project is obviously dying, and this is a shame, because I think it is a really nice project and it should survive somehow. In the last mails there, Ben Armstrong, who is a really clever guy and a good strategist, said: well, my interest just drifted away, my children grew up, and my interest is now in the Eee PC; could somebody take over this job? He said what this job consists of, point by point, and he has also written nice documentation on what you have to do if you want to start such a project and what you should be aware of. I included this in the documentation for the CDDs; this is Ben Armstrong's part. And he is making an effort so that somebody else takes over this role of the main active poster or whatever. I hope he is successful, and I think once I'm done here with all these preparations I will try to help him with kickstarting a new life for the project.

So the next one is Debian Accessibility. This project by definition has a problem: you do not have so many users for it. There are not so many users who are blind or visually impaired. Well, there are such users, but it is quite special software, and they have to work very hard to get some users or attract some attention. I know that Mario Lang is a Debian developer who is blind himself, and that is why he started this effort. I don't know the others, but you see you have some posters who are on the top 10 list with very few postings, and so this list is not so active. I hope that we could give some inspiration to the project by letting them implement this CDD stuff, which perhaps gives some visibility to this project. It
is, I would like to say, not visible, just as these people cannot see themselves, or whatever. It is a shame that it is not so vital, but it is just a problem of the number of users.

Debian Desktop was an effort to address a problem with the flexibility of Debian, namely that we do not have the most up-to-date software. So they wanted to implement a system to bring unstable into a shape that is releasable. The idea was not bad, but I think they were overrun by efforts like Ubuntu and so on. The interest was not so high in the beginning, then they got a peak from Gustavo Franco, but he obviously lost interest and stopped posting. So it is not what I would call healthy.

Debian Lex, as I said, could be quite a competitor for Debian-Med. You have about the same number of users; I think there might be the same number of doctors as lawyers, whatever, and they need some special care. So the idea is good. But even if you look at the starting point, it is not as high as in the other projects, so there was not so much energy in it. The blue peak is from a user who calls himself Lex list. Well, it is anonymous, and I don't like it; if somebody has something to say, he can say it under his own name. He chose this name to revitalize this list, and that is why there is this peak, and the other postings are responses from the other people. But in principle you can close this list, because there is nothing. It is a shame, but we just have to face it. And I just want to make you aware that this method I found to evaluate the lists works in this example, because you have seen this blue peak, and this is about reviving Debian Lex, so all these postings are about one topic.

And it also makes no sense if they are talking about whether we want to implement something in Python, Perl or PHP. This is a specific technical detail; it is nothing that helps our users. The user wants programs that work, and not programs in Python or programs in PHP;
they want programs that work. If a project is in a situation where they want to decide whether this technique is preferred and that one is not, then this is not our business. They should find good-quality software and bring it into a really good system which is ready for users, and not have this kind of discussion. So this is also a good example of what I call a vanished-leader project.

Then we have this Debian non-profit custom distribution. Here too we have a main poster. He got some other posters who sometimes posted quite frequently, but in principle this list was dying, and the last peak is only "should we remove this mailing list", and it just does not exist anymore.

Then there is Debian Enterprise. This is a funny list where you can get onto the list of top ten posters if you have two postings. This is quite interesting, and this one is even a spam mail. So we should definitely close this list, because there is nothing. The idea is very good, and it is a shame that there is just the idea and nobody who cares, but this is the problem. Yes, we created this mailing list somehow in 2007; it has existed for only two years. I thought there was formerly an effort by Ian Murdock to make an enterprise Linux or something like this, but this was somehow earlier, so I do not really remember the details. But the idea was: Debian has no certification for running Oracle or so, and it is an interesting thing to make Debian enterprise-ready in that way, so that we provide ready-to-run solutions. I would have loved it if this effort had led to something, but there is just nothing.

Then there is a general list. It is not for a specific project; it was my intent to bring all these custom Debian distributions together to find a common technique. And here you see that, in principle, this is also not really healthy. The reason, or what I expect might be the reason, is that all these projects somehow existed independently and never talked to each other, and
I was then constantly busy trying to get these people to join together and to use the same technique. And finally I found the reason: just talking doesn't help. You have to provide some technique which is attractive enough to make people want to use it. And I think these tasks pages I showed in the morning are somehow convincing to other people. There are other projects, like this Debichem project, which cares for chemistry programs. They just concentrated on adding more chemical packages to Debian, but at some point they found out that this is not all there is. This is exactly the idea I had. And when the main leader of the Debichem project, Michael Banck, who does not show up as a main poster in the stats here, saw these tasks pages, he said: oh, what are you doing there, this is cool, I wanted to do this all the time but never found the time. So I think we have the next two or three CDDs which are trying to use the same technique, and I really hope that the stats of this list will grow again, because there are a lot of common things to discuss on this list.

So you see several people you have seen before. You see me, or Otavio Salvador, who was very active in the Debian non-profit project, and Ben Armstrong, the Debian Junior man, and Jonas Smedegaard, who is in the Debian Edu project. And this is also a very interesting entry here: Vagrant Cascadian. Actually this graph is wrong, because this is one person, and he would rank even higher; I have to fix this. He is the man who has written a tool which is called simple-cdd. Well, what is it? It is a nice tool in any case, but it creates something which then lives outside Debian. It is in principle an installer for something other than Debian. And so this is a famous misinterpretation of the name custom Debian distribution, because it is not really what we want. But he said: well, CDD, that is
what I'm doing, and so this list is also even misused by people posting on this topic. So we have some mix-up here, but in the end we did nothing against it, because discussing things about special purposes might not harm the list. But we definitely need some more straightforward topics here on the list. I'm not happy about this situation, but I'm hopeful that we can improve it.

So what is the conclusion? Most CDDs depend on a single person, and it is good if you have at least two or three who are really active. Debian Edu is really healthy because it has a strong crew, and Debian-Med also has a good run-over-by-bus factor of two or three persons. This is definitely very important for a successful project. Those projects which have only a single person die at some point, because you can't rely on a free software developer always working on this topic. There are so many attractive topics around you can pick up, and so you become distracted, and the former project is in danger of dying. And we need much better management between the CDDs. This is the main conclusion I would like to draw.

At this point I am in principle at the end. You will find this talk soon at this location, and I would be happy if you have some remarks on better communication or whatever. So, any questions?

Also not so many questions, as with my first talk. As I said, either I have explained everything so well that you have no questions, or I have completely failed to explain what to do. But there is no point in staying here in the room if there are no questions. You can reach me during the whole conference. I hope you enjoyed the talk, and if you would like to prepare a CDD, just keep in mind what I have told you.
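
The top-poster evaluation described in the talk (count who posts to a list each year, keep the ten most frequent posters, and see whether more than one of them carries the activity) can be sketched roughly as follows. This is a minimal illustration and not the script used to produce the graphs. It assumes the archive index has already been reduced to `(year, poster)` pairs; the function names and the `min_share` threshold for counting someone as a main contributor are hypothetical choices made for this example, since the talk gives no numeric definition of the run-over-by-bus factor.

```python
from collections import Counter, defaultdict


def top_posters(posts, n=10):
    """Return the n most frequent posters over all years.

    posts is an iterable of (year, poster) pairs, an assumed input format.
    """
    totals = Counter(poster for _year, poster in posts)
    return [poster for poster, _count in totals.most_common(n)]


def yearly_counts(posts, posters):
    """Map each year to {poster: number of postings}, restricted to posters."""
    wanted = set(posters)
    table = defaultdict(Counter)
    for year, poster in posts:
        if poster in wanted:
            table[year][poster] += 1
    return {year: dict(counts) for year, counts in sorted(table.items())}


def bus_factor(posts, year, min_share=0.2):
    """Count posters who wrote at least min_share of that year's postings.

    min_share is a hypothetical threshold chosen only for this sketch.
    """
    counts = Counter(poster for y, poster in posts if y == year)
    total = sum(counts.values())
    if total == 0:
        return 0
    return sum(1 for c in counts.values() if c / total >= min_share)


if __name__ == "__main__":
    # Made-up sample data: one dominant poster in 2005, two in 2006.
    sample = (
        [(2005, "a")] * 8 + [(2005, "b")] + [(2005, "c")]
        + [(2006, "a")] * 5 + [(2006, "b")] * 4 + [(2006, "c")]
    )
    best = top_posters(sample, n=2)
    print(best)
    print(yearly_counts(sample, best))
    print(bus_factor(sample, 2005), bus_factor(sample, 2006))
```

Restricting the table to the overall top posters is what filters out spam and one-off robot mails in this scheme, because such senders rarely repeat often enough to enter the top ten.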