Hi everyone in the room, hello everyone online, thanks for joining for this short one. I'm Stefan, I work at TDF. I'm part of the team as a QA analyst, coming up to a year working with the team. What I'm doing today is presenting a bit of an update on a dashboard that I created a while ago but that has kept evolving, and what it can hopefully mean for the QA people, the QA team, the QA project, but also, hopefully, helping out more on the development side: giving some direction by using that Bugzilla data to try and inform what we do next.

We'll go back and forth between the tool and the presentation, just to give you an idea of what it looks like now, potentially to make you want to try it out and see if it can help you in your work in any way, and also to let you give us feedback, to see if you've got other ideas for how we could use that data better. So we'll have a quick look back at some QA stats over the years, or at least over the last couple of years, at how processing bugs and Bugzilla data can help QA and developers and inform development as much as we can. I also want to talk about my impressions after working in this space for quite a few months, and about how much it takes a whole community, with communication and interaction between different projects, to keep things moving forward.

In recent months, and since roughly the end of last year, the end of 2022, we had a good reduction in unconfirmed reports, reports that usually haven't been treated, haven't been processed or triaged by the QA team. Over the last few months we've also seen that triaging pace slow down significantly, and we're getting to a point where it's a bit more stable, around 1,100 to 1,200 unconfirmed reports. That's still quite significant, but we have to remember that we started at over 1,800, which is largely the impact of the uptake of LibreOffice in COVID times as well.

As this triaging pace slows down for various reasons, we can also see that the open reports on the platform are going down. It makes sense that when we triage reports, more reports get identified as actual issues that need fixing, so triaging pushes the number of open reports up. But it's interesting to see that even though we're quite stable, staying around 1,100 to 1,200 unconfirmed reports, we do have that downward trend continuing, with reports getting closed. It makes me think that we have to keep working at finding a balance between triaging and revisiting the very large backlog that we have.

Those graphs might be a little bit misleading, might be a bit of a data science crime, because we're removing the zero on the axis, and you'll see in the tool that there's actually a big space I'm omitting there, but we're looking at trends here. What I'd love to see is the number of unconfirmed reports going down as well as the number of open reports: fixing things at the same time as we're triaging, and getting that volume to a more manageable size. Hopefully we can reach a point where it has a kind of snowballing effect, and it becomes easier to find duplicates in the database, to reveal trends, to identify priorities in development. And we need, as I said, to deal with new reports, that constant flow of about 17 new reports a day over the last year, at the same time as we deal with historical reports that have been around for ages. So it makes me think about what new actions we can bring in to promote that trend.
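As an aside, that "17 new reports a day" figure is the kind of thing you can compute directly from a Bugzilla dump. Here is a minimal sketch with pandas, illustrative only and not the dashboard's actual code; the file name and the `creation_ts` column are assumptions about the dump's layout.

```python
import pandas as pd

# Assumed layout: one row per bug, with a `creation_ts` timestamp column.
bugs = pd.read_csv("bugzilla_dump.csv", parse_dates=["creation_ts"])

# Count reports opened per calendar day, then smooth with a 30-day rolling
# mean so the chart shows the trend rather than day-to-day noise.
daily = bugs.set_index("creation_ts").resample("D").size()
pulse = daily.rolling(window=30).mean()

# Recent values hover around ~17 new reports a day, per the talk.
print(pulse.tail())
```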
One idea would be to introduce a QA focus every week, making use of those meta bugs, those topics where we can work together as a team and try to review a large number of reports linked to the same topic. It's also an opportunity to review a theme over the last few years and potentially inform development later on.

So I'll move to the browser here, to the tool, which you'll be able to have a look at later if you wish. On this page we've got how the numbers progress over the years, and one of the most important ones, we saw a little bit of it already, is the progression of unconfirmed reports. It's probably a better idea to focus on the right side of the graph, because this is based on a Bugzilla dump that doesn't necessarily have all the reports from the early years; roughly from 2017 onwards it's quite reliable. We can see that very big increase in the number of unconfirmed reports after COVID started, and the effort to reduce that number, especially from the end of 2022. Corresponding to that, below, we've got the trend in open reports. As you've seen, this is a lot harder to see when you don't have that zoom on the last part, but overall it is nice to see that decrease in the number of open reports if we zoom in, and hopefully we can keep that going. Now, you can play with this tool and focus on the area or the time period you're interested in, for example the last year, the last 365 days. [Audience question] On those charts, no, but I'll move to a page where we can remove the enhancement requests, and it's true that it would be interesting to see that proportion on those charts too.

In this other tab we've also got what I call the pulse, which is more about how many of something happens every day. Over the last year we can see the change in the rate of reporting issues, of opening reports, including enhancement requests, hovering around 17 reports a day, which is quite significant to deal with. And if we look at the trend over the last few years, again we can see where that peak in reporting happened in 2020, around here. [Audience question] Sorry, yes, that's something we can have a look at. I'll move to the snapshot tab, where we've got a few colorful graphs, and here, if we want to, we can remove the enhancement requests from those graphs. And just to answer that question: if we want to see that peak, we've got the first version affected here. I zoomed in a little, so it's a bit hard to see the numbers, but the peak would be around 7.0 here, so 7.0, 7.1, we had quite a lot. But again, this is first version affected, data that keeps changing as we retest and figure out when something started; it is still an indicator, though.

With this chart, if you're interested in comparing with what comes from before LibreOffice was forked, you can also include what was inherited from OpenOffice, at the beginning here, which dwarfs the other versions, but that's perfectly expected. And this is not including the enhancement requests, as I turned those off just before. There are a few more charts that I'll let you explore in your own time, about the different fields that are available in Bugzilla. We can clearly see, for example, in the components graph that Writer is by far the component with the most reports, followed by Calc and Impress. You can also play with the setting that makes the categorized bar plots proportional, to see how different categories compare, not looking at absolute numbers but at proportions.
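To make that proportional view concrete, here is a rough sketch of the underlying computation: the share of reports in each severity bucket that ended up fixed. The column names `severity` and `resolution` are guesses about the dump's schema, not the tool's real code.

```python
import pandas as pd

bugs = pd.read_csv("bugzilla_dump.csv")

# For each severity bucket, the fraction of reports resolved as FIXED,
# i.e. the proportional comparison rather than absolute bar heights.
fixed_ratio = (
    bugs.assign(fixed=bugs["resolution"].eq("FIXED"))
        .groupby("severity")["fixed"]
        .mean()
        .sort_values(ascending=False)
)
print(fixed_ratio)
```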
So, as we'd expect, we see higher-severity bugs having a higher fixed ratio compared to, for example, normal and minor ones. You'd also expect trivial issues to be popular with newcomers as easy hacks, and there's a similar trend in priority. It's also interesting to see the proportion of issues that have been fixed as opposed to resolved for other reasons: 72 or 73% of resolved issues were closed for reasons other than an identified fix. In there we'll have quite a bit of "works for me", where we're not sure what exactly fixed it, but one of the biggest chunks is duplication, which I'll mention later on as well. And looking back at first version affected, we see things settle over the years. Again, we're looking at proportions here rather than absolute numbers, and it's a bit hard to see on the screen, but we usually settle at around 12% of issues unresolved in older versions, and obviously a lot more in more recent versions.

Going back to my slides: there are a few aspects to this tool, right? We just saw the charts. What I want to do with the charts is inform, and also help the team react to trends. As we add more charts, we can maybe see areas we should focus on in QA, or how we should split our attention between triaging and dealing with the backlog of older issues, and try to get that combined effort going. But it's also useful to illustrate and communicate achievements; it's very nice to see that trend going down when we work together on triaging bugs.

Throughout the tool there are links to actions, to go straight to Bugzilla and find a list of bugs to triage: unconfirmed ones, or the ones tagged as NEEDINFO that need a comment from the QA team. So we have something to work on one click away, which can be handy especially for newcomers looking for ideas on what to work on. I'll also show you in a minute the table that allows you to search for bug reports a bit differently, maybe a little more comfortably than the Bugzilla we all love. You can filter easily in that table, it's a good alternative in my opinion, and it also makes it easy to create lists, export the data, and potentially find that list again on Bugzilla later on.

That table also integrates a kind of aggregate ranking that is calculated from a few different indicators. We already have the importance fields in Bugzilla, severity and priority, that we tweak as we go. But we thought it was interesting to put together a ranking that takes more factors into account, somewhat automatically, as an indicator: not an absolute truth of how important an issue is, but an extra indicator that could help us focus on some issues, or even reconsider whether some issues are important or not. Obviously this is open to tweaking, but it's quite transparent in the tool as well: we've got a table of how much weight each factor has. For example, the number of duplicates, whether it's a regression, and data loss are pretty high up, considered by me and others as quite important, as well as accessibility issues. Whereas on the other end of the spectrum you'd have the number of comments and the age of the bug report, which don't necessarily mean that much, so they get a much lower weight.
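As a rough sketch, such an aggregate rank can be read as a weighted sum over each report's indicators. The weights and field names below are placeholders, not the tool's real values; the talk only says that duplicates, regressions, data loss and accessibility weigh heavily while comment count and age weigh little.

```python
# Placeholder weights, for illustration only.
WEIGHTS = {
    "n_duplicates":  5.0,   # weighted heavily in the tool
    "is_regression": 5.0,
    "is_data_loss":  5.0,
    "is_a11y":       4.0,
    "n_cc":          2.0,
    "n_comments":    0.5,   # weighted lightly
    "age_years":     0.2,
}

def aggregate_rank(bug: dict) -> float:
    """Weighted sum over whichever indicators the report has."""
    return sum(w * float(bug.get(key, 0)) for key, w in WEIGHTS.items())

# Example: a regression with three duplicates outranks an old, chatty report.
print(aggregate_rank({"n_duplicates": 3, "is_regression": 1}))  # 20.0
print(aggregate_rank({"n_comments": 12, "age_years": 10}))      # 8.0
```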
One recent addition was: does the bug report have a link to an Ask LibreOffice question? That's pretty important, and I'll get to why: quite a few people will ask a question on that forum but not necessarily go through the effort of reporting a bug; a lot of people find the platform quite daunting. Yes, if you want to give feedback right away, sure.

[Audience] The number of duplicates is somewhat related to the number of people in CC, so this is counting double. Not that I'm saying I disagree, just something to notice.

That's a really good point. To some extent there will be people CCing themselves when they find the bug without reporting it, but I think it would overwhelmingly be the case that it's duplication there, and we can review that and lower the weight for both. It's quite interesting. And related to that, to some extent I was considering whether priority and severity aren't already a duplicate of this whole effort, so should we lower their weight as well, since they're values we've already decided on based on looking at the rest?

[Audience] It's on the scale, so by extending it we're doubling the scale.

Sorry, say that again? Yeah, well, yes, that makes sense. So, to recap: agreed that something probably needs to be done about the duplication between the number of duplicates and the people copied in CC, but also about giving more importance to data loss issues, which, in your opinion, are more important than anything else in the table, right? Yes. Okay, well, thank you, this is feedback that I'm taking on board. This will always be subjective to some extent, and eventually what I'd like to do is maybe give users of the tool the option to define their own ranking as they wish. But thank you for that feedback, it's very valuable.

Going back to the tool, let's have a look at this table, just to illustrate it. We do have a link to the Bugzilla advanced search, just in case the data is a little bit outdated, so we can always go to that. We can do a general search here, and I find it quite handy if we want to do something quick with filters: looking in the summary for crashes, or maybe something related to formulas in the component Calc, looking at fairly recent reports, and then sorting by aggregate rank, which is that rating we were just talking about, with the information about the weighting displayed at the bottom. And then if that list of... [Audience question] Sorry? So, it is based on a data dump that I need to keep updating, but this one is from yesterday.
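That quick filter-and-sort demo maps naturally onto a DataFrame query. A sketch under assumed column names (`id`, `summary`, `component`, `creation_ts`, and a precomputed `aggregate_rank`); this is not the tool's real schema.

```python
import pandas as pd

bugs = pd.read_csv("bugzilla_dump.csv", parse_dates=["creation_ts"])

# Summary mentions crashes or formulas, component is Calc, fairly recent.
recent = bugs[
    bugs["summary"].str.contains("crash|formula", case=False, na=False)
    & (bugs["component"] == "Calc")
    & (bugs["creation_ts"] >= pd.Timestamp.now() - pd.DateOffset(years=2))
]

# Sort by the aggregate rank discussed above and keep the top hits.
top = recent.sort_values("aggregate_rank", ascending=False)
print(top[["id", "summary", "aggregate_rank"]].head(10))
```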
[Audience] I do a lot of searches in Bugzilla, and I see two problems. One is that the default setting is a time range of one day, and I often forget to change it. It's nonsense to look for duplicates within one day.

I've actually never noticed that, I didn't. To me, my default search would be this search. Okay, all right.

[Audience] The other problem is that the wording is not unique, and there are some reports which use abbreviations. My wish would be that when a report is assigned or put on a meta bug, the wording is repeated without abbreviations, and with the words that are usually used. For example, when the user writes "I cannot rotate the picture", but internally we don't use "picture" much, rather the word "image", we will not find it. So it would be good, when this bug is handled by QA, to restate it in a way that the word "image" appears. The same when someone uses "CS", was it CS, for a character style: I cannot search for that. So the report needs to be restated in words that work well for a search.

Right, yeah, that's a big challenge. Exactly. Okay, time flies, you tell me. Does that work now? So, Regina was talking about difficulties with the search on Bugzilla. I agree that there are quite a few quirks, and that for a good query we often have to think about variations in spelling, about potential abbreviations, and about using the root of a word to catch all the variations that correspond to it; it's quite tricky and takes some getting used to. Thinking of solutions, I'm wondering if there's potential, maybe in the upcoming new version of Bugzilla, to have some terms that are automatically matched when searched. Obviously, for abbreviations that are a bit obscure, that's very hard to do.

[Audience] All the time we say, please search for duplicates. And we cannot find some duplicates, because what the reporter is saying does not appear in existing bug reports. So my suggestion is that when a bug report is handled, the problem should be described in other words as well, which makes it findable.

Yeah, and that's kind of related to something we added to the wiki not that long ago. We have a page that explains quite a few abbreviations, and I added a warning at the top saying that abbreviations can sometimes feel like gatekeeping, and are also just hard to understand for many new contributors, and we therefore recommend spelling them out when practical. But it's true that it's also the responsibility of the team to make sure that all the relevant terms are included in the comments when discussing the issue. That's a very good point, thank you.
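One hypothetical way to attack that vocabulary problem, sketched here purely for illustration: expand a query with known synonyms and spelled-out abbreviations before searching. The synonym table is made up for the example; Bugzilla has no such feature today, as discussed above.

```python
# Made-up synonym/abbreviation table for the example.
SYNONYMS = {
    "image": ["picture", "graphic"],
    "cs": ["character style"],
}

def expand_query(terms: list[str]) -> set[str]:
    """Return the original terms plus any known synonyms or expansions."""
    expanded = set(terms)
    for term in terms:
        expanded.update(SYNONYMS.get(term.lower(), []))
    return expanded

# A search for "rotate image" would then also match reports saying "picture".
print(expand_query(["rotate", "image"]))
# e.g. {'rotate', 'image', 'picture', 'graphic'} (set order may vary)
```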
Now, with this table we can grab whatever's in view in a few different formats. Unfortunately no ODS here, sorry, a library limitation. But it's very easy to copy that, keep a record of your quick search, and paste it into your spreadsheet to work on later. And also to take a series of bug numbers and find that list again on Bugzilla if need be, because the search there allows you to input a number of bug identifiers and get that list saved, if you need to.

Now, moving on and looking at the time: another aspect I had a look at recently is meta bugs. We've got quite a few of those categorizing bugs, used in the Blocks field of reports to say that a report relates to this particular topic or aspect of the project. I've added a couple of visualizations to the tool, including a bubble chart that looks at the age of the tracking bug, the meta bug, as well as the fraction resolved: higher up means a higher percentage resolved, lower means a lower percentage. This can be useful, and we'll have a look at it in a second, to pinpoint some areas of interest that are maybe a bit forgotten, left to the side, that don't attract much attention, and potentially to spark ideas for the future. I also added a network visualization: we don't see the links between meta bugs on that bubble chart, but in this alternative visualization you can quickly see how they are linked to each other and get a better overview of how they affect the project. That was your question, I think, so let's go to that one.

Here the size of the bubble corresponds to the total dependencies; if you want, you can switch it to just the open dependencies to see what's left to fix. The color corresponds to the fraction of regressions in the meta bug, which can be useful to pinpoint regression hotspots. And if you want a more interesting scale here, because obviously a lot of the yellow ones will be focused on regressions, for example this one is fast parser regressions, this one is dialog tunneling regressions, we can remove the regression-only metas. We can also set a filter for a minimum number of dependencies, to filter out the really small ones, and a minimum age in weeks, to see meta bugs that have matured over the years. Then, visually, we can focus on one part of the graph, maybe below 50% resolution, and see if there are any interesting topics in there. For example, undo/redo has a lighter color, a higher proportion of regressions, is quite important in size, and quite low in proportion fixed. Similarly, we've got the Chart meta bug up there, in a similar position, and if you look around, you'll find there's also Chart enhancements at the bottom; obviously enhancement meta bugs will be quite low, that's expected. So that's something to explore. Again, we've got a table at the bottom, if we want to filter, or to have a look at what we've filtered already, search for meta bugs, and also use that aggregate ranking again, though in this case it's an average over all the dependencies. At the top we've got accessibility, and also a lot of DOCX meta bugs in here.

Finally, that network visualization I just mentioned is right next to it, in here. The main take-home message is that we've got too many meta bugs, and it's a big mess. But while the physics simulation sorts itself out in the middle, you can have a look at what's around the edges: these isolated meta bugs here, which are often separate for no good reason and could be linked to something else. You can see little clusters of meta bugs that are not necessarily connected to the rest; let me try to find something interesting here. Or you can focus on a specific meta bug by clicking on it, getting two degrees of separation with other meta bugs, and see how those interact, with arrows showing which one blocks which. There's a lot in there, so to find your way you can also use the dropdown: type, for example, "chart", press Enter, and it highlights that Chart meta bug here, and you can see how it's related to others. As you hover over it, you've got the link that takes you straight to that bug, the description, the number of reports and the percent resolved.
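For that network view, here is a minimal sketch of the "two degrees of separation" idea using networkx. The edges are placeholders named after topics mentioned in the talk; the real tool would derive them from the Blocks/Depends relationships in the Bugzilla dump.

```python
import networkx as nx

# Placeholder "blocker -> blocked" pairs between meta bugs; the real data
# comes from the Blocks/Depends fields in the Bugzilla dump.
edges = [
    ("Chart", "Chart-Enhancements"),
    ("Calc", "Chart"),
    ("Accessibility", "Writer"),
]
G = nx.DiGraph(edges)

# Clicking a meta bug in the tool shows everything within two hops of it.
focus = "Chart"
ego = nx.ego_graph(G.to_undirected(), focus, radius=2)
print(sorted(ego.nodes()))  # ['Calc', 'Chart', 'Chart-Enhancements']
```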
Now, I am very close on time; I always finish with not enough of it, but I just wanted to wrap up with how this experience tells me that there's really a need for an integrated effort. That is, in my opinion, what will work best to keep moving forward with such a large volume of reports, linked to such a successful and large project like LibreOffice. In QA specifically, I like the focus on catching duplicates early, so we can reduce wasted effort by contributors, and on catching regressions early, for the very obvious reason of getting them fixed as soon as possible, while they're still easy to fix and while the contributor is still active. But around QA, there's a lot of work that goes on.

I want to highlight those links, for example with the design and UX team, in particular with Heiko, working on the backlog hand in hand with the QA team, using our skills in different ways. Obviously testing is a very important topic that needs to be continuously looked at, including, why not, asking newcomers to implement a test as a compulsory easy hack on top of other, maybe more interesting, activities. Documentation, and recording things in the release notes, is also extremely important for people to focus on new features and get those ready for release. Good documentation also means less misunderstanding, and fewer bug reports that are about not understanding a feature rather than an actual issue. Then there's linking with Ask LibreOffice, our Q&A website, because as I mentioned before, a lot of people won't necessarily report something on Bugzilla but will ask questions; and an active localization community, because we also get issues reported about missing strings.

Last but not least, and that's a big one here, sorry, maybe a little bit violent: a welcoming, diverse community fostering an enjoyable environment. That's also how we keep people involved, keep people happy and interested in gaining new skills alongside us.

Thanks for joining today. Hopefully you found it interesting. You've got the link with the QR code to the tool; I'm very happy to hear what you think, your ideas. And again, as always, thank you all for your contributions in this space. It's much appreciated. Thanks.