Okay, let's talk a bit about Lintian: how it works, how you can modify it, and what we intend to improve in the future. That's essentially the three sections, in a different order: first I'll talk about how it works, then about some things we have discussed and want to improve in the future, and if I still have time left by then (I'm not very good at timing things like that), I'll talk about how you normally fix bugs in Lintian, so that if you want to do it, you can do it right and cause less work for us.

The short history of Lintian: it was started in 1998 by Christian Schwarz and Richard Braakman. Much of the code is still from then, and that sometimes holds development back, which was probably one of the reasons why Linda was started at some point; the other might be that Lintian is in Perl and some people don't like that. Maintainership has switched very often; I have put together a short list of all the maintainers it has had. Currently we have switched to a maintainer group, which is Jeroen, me and Marc. Other people like Colin Watson and Joey Hess have been, or still are, direct contributors, but not maintainers as such. The code is currently in a Subversion repository on a private server, but it's planned to move it to Alioth; the Alioth project already exists but is still empty, so we can't switch to it at the moment. If you want access to the repository, that's very easy, because accounts there are handled in a fairly automated way.

So, let's see how Lintian works. We have the Lintian front-end, which calls a lot of other scripts that live in different directories, either in the source tree or, in the installed state, under /usr/share.
We have an unpack phase, where we unpack the .deb or the source package to make it usable by the other scripts, and where we extract some information from the .changes file or the .dsc, whatever we have. Then we have the collection scripts, which try to gather information, for example by running file(1) on all the files in the package, or just by searching for certain files, say all the manual pages. Then there are the check scripts, which use the information the collection scripts made available to try to find errors in the package. The output can then be piped to lintian-info, which happens automatically if you call Lintian with the -i switch; it processes the output a bit and makes it more informative.

Now a bit more detail about each phase. In the unpack phase we have different levels. I think that is still from the very beginning of Lintian, and it's mainly a performance thing: if you only want to run some checks, it is often not necessary to unpack the full package, so it's faster if you don't. But I've never seen anybody run Lintian with those options, like running only a single check, so I think in any normal run it always unpacks to level two, and the levels are not really necessary anymore. The code is there and it works, so we would only change it if it became really necessary. At the first level we unpack the control tar; I have left out the compression suffix here because nowadays I think you can even change the compression, not in the archive, but in individual packages, and Lintian should support both kinds of files.
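To make that division of labour concrete, here is a toy sketch of what a check script boils down to: read information a collection script gathered, look for a problem, emit a tag. The tag name and the awk one-liner are invented for illustration; this is not the real Lintian code.

```shell
# Hypothetical sketch (not the real Lintian code): a check reduced to
# its essence. Scan collected data (here: a control file) for a
# problem and emit a tag line. Tag name "duplicated-field" is invented.
printf 'Package: foo\nVersion: 1.0\nVersion: 1.1\n' |
awk -F: '/^[A-Za-z][A-Za-z-]*:/ {
    f = tolower($1)      # compare field names case-insensitively
    if (seen[f]++) print "E: foo: duplicated-field " f
}'
```

This prints `E: foo: duplicated-field version`, already in the severity/package/tag shape that the real output uses.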
It unpacks the control tar, so the DEBIAN directory, and generates an index of the files in the data tar; basically it calls dpkg-deb -c, or tar -t, something like that. At the second level we then have the fully unpacked package. For source packages, I think, the first level is just the information from the .dsc, and the second level is the unpacked source package.

So, now we come to the collection scripts. For checks and for collection scripts we always have two files: one file is the script itself, and the other is a description file. I have given you an example here. These description files are in the usual Debian control file format and just list some information. Some of it, like the author, might be nice to know, but the important parts are: it lists the type, so the script will not be called if you check a source package and the collection only makes sense for binary packages; it states the required unpack level; and it can indicate that the script has to run after another collection script, because it uses information from there. Almost all collection scripts need unpack level one or two, and this one will only be called in the second phase and requires that the file-index collection has run before. That is needed because you can decide to run only some collection scripts, so this is a kind of dependency handling. The example collection script itself just runs over the file list and calls file(1) on each file. The check scripts have similar description files, but there is more information in them.
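From memory, a description file for such a collection script looks roughly like this; the exact field names and values are a sketch and may not match the current source:

```
Collector-Script: file-info
Author: J. Random Developer <jrandom@example.org>
Type: binary
Unpack-Level: 1
Needs-Info: index
Order: 2
```

Needs-Info expresses the dependency: the index collection has to run first so that the file list exists before file(1) is run over it.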
The Type, Unpack-Level and Needs-Info parts are similar, but the description file also defines the tags: the name of each tag, which is referenced in the check itself; information such as whether it is an error, a warning or an info; and the extended description that you can print with lintian-info. It's all in the description file and not mixed with the code.

I have put up a similar example for a check script. It's a normal Perl module. Previously the checks were Perl scripts, but we changed them to Perl modules so that we can just include them instead of having to fork and exec each script, which is just a waste of time. We plan to do the same for the collection scripts. It uses some internal libraries, which essentially give you the ability to output tags. It then opens the control file, which was made available by a collection script and is just the normal control file of the package. I have only excerpted one of the checks here; the real script has more. You can see that it just iterates over the fields and searches for duplicated ones. Nothing very exciting, but it shows the pattern.

This is the output you normally see when you run Lintian: the severity, the tag name, and some extra information about where the error was found or what exactly the problem was. You can pipe that into lintian-info, which adds the extended description, where we describe the problem in detail and tell you what you can do about it. It just takes the text from the description file and strips out the HTML markup; we use HTML there so that we can use the same descriptions on lintian.debian.org. lintian-info itself is a very short script, nothing very interesting, but it could probably be extended to be much more useful; more on that in the further-development section. Questions so far? I have a question.
For example, in that structure, you have dependencies, like Needs-Info on the file-info collection, and you also have something like an order field. Why do you need the order field as well?

I honestly don't know. I haven't checked the code recently to see whether it's really needed anymore; I don't know if one of the two was added before the other. I will have to check the code. (It probably saves you from having to compute a partial order.) No, you don't have to; it could be used as a statement like: okay, this one has dependencies, and then you figure out later which ones and just run it after the other stuff. And Needs-Info is probably still needed if you choose to run only certain checks and collection scripts, so that the required collections are run even then and you can't break things by selecting the wrong subset. I guess it's something like that. This is code we've basically never touched; most of the work in the last years was done on the checks.

Okay, further development. First of all, as you have seen right now in this discussion about the ordering, the code is pretty old, and there is stuff written in many, many lines of code that could use a more modern Perl feature and be written in one line. Then there is the fact that we run all the scripts with fork and exec, even though it's all Perl; I think we have one shell script left somewhere, but that could easily be converted to Perl too. It would probably be way faster to just include the whole thing instead of having to fork 30 or 40 times for one package. And if we used modules all the time, we could pass objects around and would probably save many of the intermediate files: the collection scripts currently dump everything to files, and the check scripts read those files in again, which is also a waste, especially when the check scripts have to parse the format again. So that's...
But that's the kind of stuff where it works, and if you don't have the time, you don't really want to touch it and then have to fix all the breakage you've introduced. Also, the front-end code is really horrible. It's just one really long script, very unordered, and there is code in there at certain points where we don't know why it's there. (You should remove it and see what happens.) Yeah. And all the code that is used for checking a whole archive, like on lintian.debian.org, is not very well separated from the code that you need to run it on a single package. It works, but it could be way more maintainable. So that's in the "if we have time some day" section.

Another thing: the current code generates a temporary directory, the lab, that the collection scripts dump their information into and that the check scripts get their information from, and we all find it rather horrible. The whole structure of the lab is very complex, and if you are new to Lintian and want to write a new check, it takes some time to understand where to find the information. The code that generates lintian.debian.org has to run over the whole lab, and the index is in a very weird format. There is also no proper storage of the results of previous runs: lintian.debian.org currently uses a single log file that it just appends to. It only runs over the new packages, so it doesn't have to re-check the whole archive each day, but then it parses the whole log file again to generate the HTML pages. That would probably benefit a lot from a simple database instead of one very long flat file.
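A toy sketch of that append-and-reparse scheme (the file name and tags are invented; the real log is in the normal Lintian output format):

```shell
# Hypothetical sketch: results are appended to one long log file, and
# the report is regenerated by re-reading the whole file every time.
printf 'W: foo: some-old-tag\n' >  lintian.log   # from earlier runs
printf 'E: bar: some-new-tag\n' >> lintian.log   # appended by today's run
grep -c '^E:' lintian.log                        # re-scan everything for the report
```

Only new packages get checked, but the whole log is always re-read, which is why a small database would be nicer.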
When I'm reading the lab code, I'm always confused: is there a big difference between running Lintian on a single package and on the whole archive? It sounds like a whole different thing, and I don't have a lab at home.

Okay, I should have explained this in more detail. The lab is just the name used in the code for the directory where all the information is dumped into. If you run Lintian on a single package, it's just a temporary directory: it's generated, all the information from the collection scripts is dumped there, and after the checks are run it's deleted. You can also make a static lab; then the information and the unpacked packages are kept, and if you run Lintian again on the same lab, it looks at whether the packages have changed, and if they haven't changed, it doesn't check them again.

How does it know? I think it deletes the unpacked contents, like the whole unpacked binary and source package, but the information about the version number, the architecture and things like that remains. Does it compare checksums or something?
Because the version number doesn't help me if I'm running it at home. Yeah, okay; I don't think it checks any checksums. It checks the version number and the architecture, and it just assumes that the archive is sane enough that two files with the same version number and the same architecture but different contents never enter the archive. So you can break it, but that's a very rare case: I believe you would have to upload the package, then delete it from unstable, and upload it again with the same version. That has actually happened on one occasion, but it wasn't Lintian that broke on it; it was something else, and it was very interesting.

Okay. Currently there is a hack in place to use information from the source package while checking the binary package, which is needed for some checks; it's not implemented very cleanly, it just creates symlinks in the lab, and it's not very reliable. So one thing we want is for the whole lab to become more structured and better documented, as a first step; and the whole logic for keeping old results around so they can be used in later runs also needs some work.

Now to a completely different topic: user interaction. Currently, Lintian only sorts its output into three classes: a tag is either an info, a warning or an error. These three classes are somewhat defined: an error should be a policy violation or a similar thing that will probably cause serious problems; a warning should be anything else; an info should be used for things that are only cosmetic. But we have downgraded some errors to warnings because they had too many false positives and we didn't want people to complain, even though by that definition they were real errors. So the idea is to give people a better hint. What we want is, first, a severity, which works like now, and second, the information about how sure we are that a check is reliable, as a second dimension that is not intermixed with it. Right now a warning can either mean a very severe violation that we are not sure about, or something that just isn't so important. This is also a good chance to go over the whole categorization and check it, because some of it is very weird: it has grown over time, different people had different opinions, and nobody has ever gone over all the hundreds of Lintian tags and checked whether they are consistently classified.

So my current plan, and it's actually my pet project among the ones presented here, is to have two dimensions. One is the severity, which should work like right now but should probably get one or two more categories, so that we can divide between policy violations, serious problems that are not covered by policy, warnings, and cosmetic things; probably four or five categories. The second dimension will be significance: how sure we are about what we have seen, with categories ranging from "yes, we are certain, because we test exactly this" down to "could be, could be not; 50% of the time we are right". (Please, can we have that classified in a machine-readable way, so that we can build on it?) Since it's not written yet, it's very flexible.

One thing I was thinking about, when messing with the code that outputs the tags, is to have two different output formats: one for humans to read, which would probably be very similar to what we have right now; the idea there was to use things like a question mark after the severity to indicate that we are not sure. And perhaps something for other scripts, like the QA pages, qa.debian.org/developer.php, the PTS, so that we can produce a more convenient
format for them to parse, like XML or something; I don't know right now. Currently, the only thing that is well defined is that first comes the severity, then the package, then the tag name; everything after that is undefined. And especially if we introduce these two dimensions for rating errors, parsing would probably become more complex, and it would be easier to just use a library and a defined format.

The idea for the machine-readable format is also that you can use it as input to lintian-info: you feed it such a file and you get a complete list of the errors it describes. My idea was just to add a new option, an output-format switch or something like that, and the library that currently emits the tags can then decide, based on that option, which output format it generates. That could be useful because when you get Lintian output from somewhere, you often just get the one-liners; you could put those one-liners into lintian-info and get the complete description of all the errors, instead of having to run the complete Lintian again. Of course lintian-info would then have to be extended to support all the output formats, but that should be no problem.

Okay, I'm done with this slide. Another thing: there are more and more tools by now. There is Lintian and there is Linda; you can use debdiff to search for some obvious errors in your package compared with the previous version; we have piuparts, which was written not so long ago by Lars Wirzenius. And these are not integrated with each other, and there is no front-end to run them all. That's one project I think would be useful: first of all, to write a front-end so that you can just say, "this package was just built, please run all the checks we have on it, and just tell me if it's okay".
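Such a front-end could start out as simple as this sketch; the wrapper and its behaviour are invented, only the tool names come from the talk, and the real tools take different options (some need the package installed), so this only shows the shape of the idea:

```shell
# Hypothetical sketch of a "run every checker" front-end.
run_all_checks() {
    for tool in lintian linda debdiff piuparts; do
        # In a real wrapper each tool would actually be invoked here,
        # with its own options and environment requirements.
        echo "would run: $tool $1"
    done
}
run_all_checks foo_1.0_i386.deb
```

It just announces one line per tool; the hard part, which the talk goes into next, is that the tools need very different environments.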
That's one thing that would be really useful. I'm probably not the only one who has a homegrown script at home that essentially does this, but it depends too much on my build environment and setup and things like that, and it would probably be useful if someone found a way to do this in a more general fashion so that everyone can use it.

Also, all of these tools find very different errors. debdiff is very useful to check for things like: you lost a library dependency out of nowhere, so there is probably something wrong; or you lost all your files, probably something wrong. And this is an error that Lintian or Linda will never be able to detect, because they have no history; they can only see the current package. If it's empty, okay, you probably wanted it to be empty; if some required file is missing, okay, we can say that's against policy, but if you have no binaries you probably don't need it. But if the previous version had all the binaries in the first package and in your newly built packages they are lost, debdiff sees the error. So that's a very different kind of error from what we can find.

What we could probably do is the other way around. piuparts, for instance, checks installation, and some things Lintian or Linda can't detect because the package isn't installed. If it's installed, you can check for things like broken symlinks; you can't do that if it's not installed, because it might just have a dependency that provides the link target, or it will fix the symlink in the postinst. And there are checks, especially on dynamic libraries, like: oh, it depends on both libc5 and libc6, or on both libstdc++5 and libstdc++6; we can't detect this in the uninstalled state, because we can't rely on ldd, since the dependencies are not installed. So one thing that would be useful would be to allow running something like Lintian (I'm not sure how much code you could share; it would be useful to find that out) inside
piuparts, because the package is then installed and you can run these checks. Of course, all these checks would still have to be written, because today we can't check this at all, but it would probably be possible to reuse some of the unpacking and collection infrastructure we already have, so that we don't have to write a whole new framework for this.

Here is also an example of what comes out if you try to hack checks like this in quickly; I was very disappointed by this one. debdiff tries to find out whether you link against two versions of a library, but in the current C++ transition the library names don't change, only the package names. So if you have built in a clean chroot, and on your build system the old version is still installed, you get completely wrong results, because it tries to match the library names, and those are still the same; the dependencies would ensure that the package never actually depends on two different versions of the library.

Another thing that would be very interesting would be to enhance the unpack code to be able to handle already-unpacked source packages, because such a tree is essentially already unpacked; then we would be able to check unpacked source trees without having to repack them first.

I have four or five minutes left, so I will probably skip the part I would have needed ten minutes for. Let's talk about this slide, though. If you want to get something changed in Lintian, the first step is always to report a bug, something like "this check has a false positive" or "please add this check". But the best chance of getting it implemented is if you provide a patch. I wanted to show what a really good patch for a certain Lintian bug would look like; it would have talked a bit about the test suite we have, but the timing doesn't allow it, so check the slides and talk with the maintainers if you want to know more. It's also very useful, if you can't provide a patch, when you file a
wishlist bug and want to add a new check, but you can't write the patch because you don't speak Perl, or you don't have the time: what is really useful then is if you think about an algorithm for how to check it. It's not always obvious how to come from a problem to a check, especially a check that is useful and doesn't produce too many false positives. So if you give us an algorithm, like "go through the list of files and search for this pattern", then coding that up is not a problem.

There are also mailing lists. If you want to be informed about what's happening in development, or want to help discuss the broader changes I mentioned, just subscribe; that would be the Lintian maintainers' list, and by now the commits go through that list as well, so you can follow the changes there too, although the traffic is not that high.

Okay, then I think I'm finished; over to questions.

I wonder if there are checks you do on the unpacked package data which you could not do when the package is installed. I think you should be able to do all of them. Okay, so you can do them all, but the problem is integrating that with the piuparts code; you could save one run then. I see no problem with it right now, because what we essentially do is dpkg-deb -x; the other thing is that the maintainer scripts can change things a bit, because you can do odd things in the postinst, but that's probably not that common. There is another disadvantage: if you want to run it on an installed package, then you need the correct architecture, whereas if you just unpack it you
can do that on any architecture. No, the point was not that; the point is that if you ran it on the installed package, it wouldn't be useful to run all the checks twice. Right, the point was just that running Lintian, then piuparts, and then Lintian again would not be useful, because all the checks you did in the first run you could do in the second one too, except for the stuff on the source package; but of course we don't do anything with the source package there anyway. Other questions? Thank you.