He's going to be presenting PET, a tool for tracking packages. Hello, good morning. Well, bear with me if my English is not very good; at any time, interrupt me and ask, please. Well, this talk is about a tool that was developed during the last 12 months inside the Perl group, but it then turned out to be useful to other groups and individuals. It's based on some needs that are very important for a packaging group, especially if you're a big one: you have many packages and you need to know some information about them. You need to know, for example, which of the packages are in need of work. If you have 666 packages, it's difficult to know which of them has a new upstream version, or has bugs, or whatever. It's important to have all this information in one single place, because if not, you lose a lot of time looking for the different problems in different places, like the BTS for bugs or the QA pages for the status in the archive, and so on. Related to that is the fact that, with all this information presented in a visually clear way, just by looking at the screen you can know more or less what the status of your packages is. It's also important to make teamwork easier: to allow people to interact not only over IRC, but also by looking at what others are doing and what the status of other people's packages is. For example, something we did a lot: when I was not a DD, I wanted to signal to the DDs in the group that my packages were ready to be uploaded. And something else very important: most of the existing tools only know about the archive. When you're working in a packaging group, it's very important to also know the status in your repository, because you may have already fixed many problems there, and you don't want those problems showing up after you have fixed them in the repository. This is one of the existing tools, a really great tool: the QA pages.
It has lots of information, but it's only for one package. So if you have 800 packages or so, you cannot use this to know the overall status. Another nice tool is the Developer's Packages Overview. It has lots of information in one single place, but there's no way of filtering there, so you can't know which of your 803 packages are in trouble right now, or sort them, or classify them, et cetera. The first tool was an example of good information, but dispersed over many places; and both lack categorization, filtering, sorting, and so on. The most important problem of those tools is that they don't know anything about your repository: the changes you have made in the last weeks, for example. Also, all these tools work with the maintainer email as a grouping key, so if in your group there are different ways of saying that these are the team's packages, you won't see them in the same report. And another problem that was important to me at the time: each time I needed an upload, I couldn't do it myself, I had to ask a DD to sponsor it for me. So this tool evolved from some key ideas. Firstly, the core of the information is the VCS, in this case Subversion, though in the future it will support other VCS tools, instead of the email of the maintainer or whatever. Also, it's important that on every commit it retrieves the information that changed and updates the reports instantly, so you have real-time information about what every person in your group is doing. It relies on a few conventions, not many, that are based on common practice in the Perl group at least, and that allow it to be more powerful without making things more complex. It also doesn't build reports: it doesn't build HTML files or XML or whatever, it just gathers data. It gathers data, and then there are some other scripts that make the reports, the web page or the RSS feed or whatever.
Also, the initial idea was to make it better than the former code, which was a bunch of ad hoc scripts: to make it modular and more extensible, and, especially important, to separate the presentation from the logic, so that other people who don't want to know the gory details can make nice web pages. This is temporarily called, though I guess it will stay like this, PET, the Package Entropy Tracker. As I said, it started as a few scripts and was rewritten last year. And it was not only me working on this: Gregor, here, and Damyan, watching us from Europe, worked a lot on this. This is an example of a running instance, from the VoIP team. This is the CGI script. You see, there is a high density of information there; there are lots of links, and things that pop up when you hover over them. The nice thing about this is that in just one screen you see all your packages that need, for example, a new upstream version, here. This is another example that I like a lot, because I didn't do anything for it: they made a really nice template without changing one line of code. It's the same code behind it, just a different template. Also, it's not only for packaging groups; I use it myself for my very few packages. This is the example of Gregor's personal repository. It's quite useful, because even if you're not interacting with other people in a packaging group, it's very useful to have a report on upstream versions, bugs, status in the archive, et cetera, all in one place. Well, something about the way it works. There are many sources of information here. As I said before, the idea is to build a database that is meant to be used later for reporting, so it gathers information from many different places, and it's easy to send it to other places. Firstly, the Debian archive: it downloads the source indexes. Then the VCS, SVN as of now, but it's meant to work with any other VCS. And the BTS, of course, for retrieving all the bugs.
The watch files are used to check the versions on the upstream websites. And there are many other sources one could add: I can think of popcon, DEHS, or SvnBuildStat, which is one of the tools that I would really like to integrate with this. From the archive, the tool retrieves, firstly and most importantly, which version the package has in the different distributions: unstable, stable, testing, experimental. That way it can know when a package has been removed from testing or from unstable, or is only in experimental. It can also show you when a package disappeared: sometimes you upload a package and it takes many hours to know where the package is now, especially after NEW processing or so. So it keeps track of it and shows: OK, you have to wait, it's being processed, don't worry. It also reads all the control data from the packages, which provides good information: to see which packages in your repository are not officially maintained by you; to look at the DM-Upload-Allowed flag, especially important for DMs and for sponsors of DMs; and, very important, to map source and binary packages, because this is the authoritative source of that information: in your repository you have source packages, and then you have to map bugs, which are filed against binary packages, and so on. It also reads the NEW and incoming queues, to track where the packages are. From Subversion, or whatever other repository you can think of, the idea is to use it as a repository of files: the most important thing is to retrieve the debian directory with all the files it includes, but also to read the tags, because the tags are used in a special way; that's one of the conventions I will get back to later. So, first, the same information as in the archive: it reads the debian/control file, but with the current status. For example, you have adopted a package: in the archive it's not yours, but you have it in your repository.
You're working on that, so you need to keep track of it until you upload; it needs to know which are your packages. And the other way around: you can have something in the archive that is in your name but that you don't have in your repository. Only what is really there counts, because the repository is the key of this. It also detects patches; changelogs are read to track versions and also to track the bugs being closed in a commit, et cetera; and watch files are used later to retrieve the information from upstream. And, as I said, the tags: combined with the distribution in the changelog, they let it detect what is work in progress. If somebody commits and the changelog has UNRELEASED as the distribution, the program knows that it's not meant to be uploaded now but is work in progress, so it classifies it differently. And the tags matter because when you tag, you're saying: I have just uploaded. Those are the conventions I was talking about: a few conventions that I think are good practices, and that are based on what was really done in the group. Using the changelog as a communication medium is very flexible; it's really easy to see it in the web page, and everybody has to edit it before committing anyway (if you use debcommit it's even easier), et cetera. So you put there notes to other people, warnings, et cetera. Use the distribution in the changelog, not a tag, to signal whether the package is ready or not, and create a tag on each upload. These few conventions are all that is needed to make it work properly. Sorry, my throat is not very good today. Well, from the watch files, the obvious question is: is there a new upstream version? But from the watch files you can also know if upstream just put up a different version and the number went down instead of up. It flags that too, and it also tells you when the upstream website is not working, et cetera, so you can fix your watch file.
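As a rough illustration of these watch-file checks, here is a minimal sketch in Python (PET itself is written in Perl); the function name and the simplified dotted-numeric comparison are my own assumptions, since real Debian version comparison follows dpkg's more elaborate rules (letters, tildes, epochs, and so on):

```python
def upstream_status(repo_upstream, latest_upstream):
    """Compare upstream versions as dotted numbers (a simplification:
    real Debian comparison uses dpkg's algorithm, which also handles
    letters, tildes, epochs, etc.)."""
    def key(version):
        return [int(part) for part in version.split(".")]
    if key(latest_upstream) > key(repo_upstream):
        return "newer upstream available"
    if key(latest_upstream) < key(repo_upstream):
        return "upstream version went down"   # flagged as a problem, not just ignored
    return "up to date"
```

The point is that "went down" is reported as its own problem state, rather than being silently treated as "no new version".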
And sorry to the people who don't like watch files, but this tool considers the lack of a watch file, or a problem in a watch file, an error. Also, in the code I needed to write an almost-replacement for uscan, to avoid calling uscan 800 times each hour, which is not very efficient. And for the Perl group we also use a small optimization, which is to download all the indexes from CPAN (the module indexes, the author indexes, et cetera) and avoid going to the web all the time. As you have seen in one of the examples, the Perl group now has over 900 packages in the repository. So there is some bias towards the Perl group, but in any case it's quite stress-tested by now; I don't think many groups have more than 900 packages in their repositories. Well, some visual niceties can be seen here. You can't read it, so I will have to read it for you. This is the main visible script, the CGI; it's currently running on alioth. Many groups have their own instances. This is the place where the database is read, the data is mined and classified, and then it's passed through a template. This is showing the compressed view, with the filters here: you can filter bugs and have some options there. And each of these blocks is a block of packages, classified by different criteria. The first one reads "newer upstream available"; "newer upstream release available", that should be corrected. "Newer upstream release available but already worked on" is the second category. Packages that are ready for upload. Packages that are tagged: somebody said "I have uploaded", but it's not in the archive, so something is happening there. Packages that are in NEW and incoming. Packages with version problems, because upstream says it has an older version of the package, or something like that. Packages with RC bugs. Packages that are work in progress, which is, more or less, whatever doesn't fit anywhere else: what doesn't fit in the other categories is put there. And also packages whose only problem is bugs.
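The category list above could be sketched as a classification function like this; this is a Python illustration (PET itself is Perl), and the function, the field names, and the exact ordering of the checks are my own simplifications, not PET's actual code:

```python
def classify(pkg):
    """Assign a package (a dict of facts) to the first matching category."""
    if pkg.get("tagged") and not pkg.get("in_archive"):
        return "tagged, waiting for archive"     # uploaded, archive hasn't caught up
    if pkg.get("changelog_dist") != "UNRELEASED" and not pkg.get("tagged"):
        return "ready for upload"                # released distribution, no tag yet
    if pkg.get("upstream_version_newer"):
        return "newer upstream available"
    if pkg.get("rc_bugs"):
        return "RC bugs"
    return "work in progress"                    # the catch-all bucket
```

Because the checks run in order and return on the first match, each package lands in exactly one category.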
It's important to know that each package only belongs to one category; it's just a way of ordering the work to make it easier to read. And this is just 10 lines of code that can easily be changed to anyone's needs. Well, this cannot be seen very well from afar. This is the first block, the "newer upstream version available" category. You see on the right that they all have blue boxes. A blue box signals that there is a problem in that part of the report, which here is upstream: it says that upstream has a newer version than you. You see the middle column, which says "archive": that's the latest version in the archive, usually in experimental or unstable, and it's flagged when it's in experimental. And there is also the version in your repository. You see that many packages have two versions there, one of them inside parentheses. The big one is the latest version you have that is released, the latest version whose changelog says it is meant for unstable or stable or whatever, but not UNRELEASED. And the small one is the version that is being worked on, the UNRELEASED version. And you see the mouse hovering over the bugs column: when you hover over any of those underlined links, you get a pop-up showing all your bugs, and each one of them can be clicked. When you go to the version in the repository, you see the last person that changed the changelog, and the same for the two versions, the unreleased and the released one. And when you click on the maintainer's name, you see the snippet from the changelog with all the text. That's why I said that the changelog can be used as a communication medium: you always want to see what the last commit said in there. So it's very useful; it works like that nowadays. You see there, there was an unreleased version, so the font is a little smaller. And that's the other version of the same package, the one that was uploaded to unstable.
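A minimal sketch of how those two versions could be extracted from a changelog, assuming the standard Debian changelog header line `package (version) distribution; urgency=...` (Python for illustration; PET itself is Perl, and this helper is hypothetical):

```python
import re

# Matches a Debian changelog header: "package (version) distribution; ..."
HEADER = re.compile(r"^(?P<src>\S+) \((?P<ver>[^)]+)\) (?P<dist>\S+);")

def versions_from_changelog(changelog):
    """Return (latest released version, latest UNRELEASED version).

    Entries are assumed newest-first, as in a real changelog, so the
    first match of each kind wins."""
    released = unreleased = None
    for line in changelog.splitlines():
        m = HEADER.match(line)
        if not m:
            continue
        if m.group("dist") == "UNRELEASED":
            unreleased = unreleased or m.group("ver")
        else:
            released = released or m.group("ver")
    return released, unreleased
```

This is how the big (released) and small (UNRELEASED, in parentheses) version numbers in the report could both come from the same file.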
It also has some other nice things, like reloading the boxes without reloading the whole page: Web 2.0 stuff. Well, these are the other interesting parts I wanted to show you. This first one here is "ready for upload". This was a package where, in the last commit, the distribution was set to unstable, but there is no tag in the repository with that version. So the tool recognized that somebody finished their work and is waiting for an upload. So if you're a DD, you have many non-DDs in the group and you want to help them: you go to this place and say, OK, I will upload and sponsor this. The next one is a package that is tagged, so somebody said that it has been uploaded, but it's not in the archive. So maybe there is a problem because you forgot to upload, or the package is going through incoming or NEW or whatever and the tool hasn't seen it yet. And the next one is the packages that are in NEW and incoming; this example only shows six packages in incoming. Also, I forgot to mention: these numbers are the number of packages shown and the total number of packages. It's very important that this filters a lot: it tries to show you only the packages that need work, because if it were showing 900 packages all the time, it wouldn't be useful. Well, this is just an explanation of the assumed workflow for all this stuff to work as intended. It's nothing new; it's just what was already being used in the Perl group at the moment. You first do an initial import, or a new upstream version gets merged. You put all your source changes in different patches, so that's also seen by the tool. You always use UNRELEASED while you're still working. You commit often, so other people know what you're doing and the tool gets updated often too. That's useful also because, if you put your commit messages in the changelog, the changelog accumulates really useful information, and other members can see what you are doing and can help you.
It's quite common to put in the changelog "need help with this", and somebody will see it and just try to help and fix it. When you're done, you just change the distribution to unstable, commit, and never forget to tag after the upload. If you're not a DD, as I said before, you just stop at the commit, and somebody will take it from there, hopefully. Now, the components of this thing. The first component, the complicated one, is the retrieval script. It's not a big script, but it uses a lot of modules that retrieve different stuff from all the data sources I mentioned before: the archive, the BTS, SVN, upstream, and whatnot. It runs on each commit in a special mode which is faster: it only retrieves the information that changed, and if you changed something like a watch file, it fetches the upstream version, but only if you changed that file. All the long-running tasks are done in a cron job; it's the same script, just with a different command line. It knows how to manage the data relationships, so it doesn't have to reread everything on each run: it will only go to upstream if the watch file changed or a timer has expired, and the same for changelogs and bugs, et cetera. Currently it uses Storable files (Storable is a Perl module) as a backend, which is quite easy to use; that should be changed someday in the future. The reporting tools are small scripts, really small scripts, that just take the data from the pseudo-database that is the Storable file, which is only a hash stored in a file. The main script, the one I showed you before, is qareport.cgi. Gregor has written a nice RSS feed which shows you the new packages that need uploads. There's also a command-line tool, which is mostly for testing but is also handy: it does the same as the CGI, but in text and smaller.
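A rough Python analogue of that design: one big hash persisted to a file (pickle standing in for Perl's Storable), with per-entry bookkeeping so expensive checks are only redone when the source changed or a timer expired. All names here are illustrative assumptions, not PET's actual interface:

```python
import pickle
import time

def load(path):
    """Load the pseudo-database: a dict stored in a file (or empty if absent)."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return {}

def save(path, db):
    """Persist the whole dict back to the file."""
    with open(path, "wb") as f:
        pickle.dump(db, f)

def needs_refresh(entry, max_age, now=None):
    """Re-fetch upstream info only if the watch file changed or too much
    time has passed since the last check."""
    now = time.time() if now is None else now
    return entry.get("watch_changed", False) or now - entry.get("checked", 0) > max_age
```

The per-commit mode and the cron mode would both operate on this same store, just with different refresh decisions.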
And also, a couple of months ago, I was asked by the CPAN people to do some reports for them, because they are very interested in how we work and the tools that we are using. They have put up some metrics on CPAN that give points to a package (to a "distribution", in CPAN parlance) if the package is in Debian, if the version is current, if there are no bugs and no patches in Debian, et cetera. So it's quite inspiring to be useful even to people outside Debian, and it was quite easy, just giving a report of data that was already there. The plans for the future are to remove the dependency on Subversion and make that component pluggable, so it can support Git or CVS or whatever; and also to build a meta-repository over that, so you can have many repositories, for example to have a report of the whole archive, but based on the VCS repositories. Also, to tidy up the code a little, because the data structures don't have any structure at all: they are just hashes, which is easy and fine, but that has to be changed so it's more manageable, because there's so much data in there that you get lost without structure. And also to integrate more sources of information; I think many people will think of creative ways of improving this. Oh, I'm talking too fast, as usual. Well, that's the introduction to the tool. This talk was meant for presenting the tool to people who don't know about it. I think it's very useful for teams and for individuals. In our experience, this tool, and also the previous tools, did a lot for the work of the team: it allowed a very small group of people to manage this sheer amount of packages. And I think it promotes a good way of working. So I want to promote it, but I would also like more people to join the development effort and improve it. So, any questions? This looks all very similar to DDPO.
Have you thought about maybe doing joint work and integrating with DDPO, or maybe about what should be changed in DDPO to make it useful for the Perl group? Well, for starters, the most important thing is that this tracks the current status of the Subversion repository, not what is in the archive. That is the most important difference from DDPO. Yes, it could be integrated; it would be nice. And many ideas were taken from there. So yes, it could be a good idea to integrate it and make it usable by everybody automatically. Yes? I liked your mental notes about the new uscan and the new watch file format; could you elaborate a little bit more on this? I forgot about that, yes. We talked about that the other day at the booth, where we were with Andreas and some other people, and this question about uscan came up. I don't know if many people have tried to write a uscan, a watch file parser, from scratch, which is quite difficult. But this tool's uscan is completely separate code, and that way I discovered that there are some problems with the current uscan tool, some inconsistencies, and especially some things that cannot be done easily. So yes, that's one of my mental notes there. Yes, I have the idea to release it some day, when I have the time after this, of course, as a replacement for uscan; and also to design a new format, version four or five of watch files, to make it completely different and sane, which is really needed. Anything else? You have plenty of time; I just speak too fast. Please talk to the devscripts people about that, so that maybe the old uscan can just be replaced. I know it's a horrible mess of spaghetti code. Sorry again, please? The old uscan is a horrible mess of spaghetti code, so I think the devscripts people would be happy if you gave them a replacement. Oh, good. Yes, it's really spaghetti code; I spent days trying to understand it.
I guess this will be a short talk. Anything else? Hello. So there's a Perl group bias at the moment; how easy is it to set up new instances of this? To what? To set up new instances of this tool. Ah, to install it from scratch. Well, it's not really hard, but it takes some work. It's not currently in the archive; it's not packaged, though it should be at some point. The official version is in the pkg-perl group repository. You have to put it on the same server as the SVN repository nowadays, because it relies on direct access to it for the commit notifications, and also for speed. You just have to put a hook on the post-commit, set up a cron job, set up the configuration, and add a few links in your web server. It's not that hard; it takes maybe one hour to set it up completely. I don't have the list here, but there are like 10 installations by now in different groups, and I don't know how many personal installations, so it's not that hard. It's only a little heavy, I have to say, because of all the information, but for modern machines it's OK. Faidon? As a user it's fairly easy, and having done one of the installations, it's fairly easy too. We have it for the VoIP group. It's nothing to set up; it's like half an hour or less, and extremely easy. Great to know. Well, that's it for the questions. Well, thank you for coming and listening.