Okay, cool. Hello everybody. Again, I've ended up running a webteam BoF. I try not to claim I'm part of the webteam, but I guess that's kind of failing these days. We don't have that many of us actually here. I guess Rhonda and I are the only two I'd actually recognise directly as webteam, but there are lots of other people around. Laura is hoping to join in remotely, I understand, I hope. But we'll see how we go.

It's more a case of updating on what we've managed to do in the last year or so and what we want to do next. We always have discussions ongoing about potential redesign work. We have, in fact, managed to migrate out of CVS; more on that in a moment. And there's always stuff that I think we should be talking about in terms of content.

So very, very definitely this is a BoF. Although I have some slides, I have like five of them. I'm not going to be stood here talking for 45 minutes. This is a discussion session. Please dive in. Definitely, definitely, if some kind people could jump in and take notes in Gobby as well, that would be lovely. I will do my usual thing of making sure that a summary of the session is sent out afterwards to the lists. There's no point in having a session like this at DebConf and then not sharing what's happened.

So, CVS migration. We actually managed it after about how long? Many years? 10, 15? I don't know. Years' worth of talking about moving out of CVS. What actually triggered us to really, really, really do it this time was Alioth going away. We weren't going to have a CVS server anymore, which would have made the website hard. We spoke about this last year at some length, about plans and what we might do. It went kind of like we planned, but also not really. It was quite involved. So, Laura did a huge amount of prep work, with some help from others, in terms of actually migrating, physically just migrating the data and the history.
We have a very, very large repository full of lots and lots and lots of small files, and that means it takes forever just going through, tracking all the revision history. We have a large number of translation tools, which don't do any translating themselves but help us keep on top of the status of the various translated versions of the pages and that kind of thing. That took quite a lot of effort to go through. We did have the beginnings of VCS-common tooling from when somebody had previously tried to move us to Subversion, but that was never finished. So, I got involved and helped to hack on these tools, to move over from a set of things that directly looked at CVS metadata on disk to things that actually ended up calling Git and understanding how Git worked instead. That was a good three weeks' worth of effort, and I'm glad it's done.

Or at least I thought it was done. I was two weeks in and thought, yeah, it's all finished. Why doesn't anything build anymore? And then I had surprise number one of a series. Hands up here, who knows how WML works? I don't believe you. Yes. So I was horrified, because it came as a total surprise that in WML you can just include arbitrary Perl in your pages, and the output from that Perl can then generate code, can generate HTML output, whatever, can do all kinds of things. So I thought I was finished because I'd found a set of top-level scripts and, yeah, I fixed all those, they all worked great. Now what? Oh.

So then it's a case of, oh, things are going a bit slowly. Because of course, when you run a translation update checker at the top level, you can spend 10 seconds, 20 seconds working out the state of Git before you start, run across 20,000 files, and it's fine. The cost of doing that calculation up front is amortised well; it's quite cheap.
Now imagine that you have 20,000 or so individual invocations of WML when you type make at the top level. Having each one of those spending 20 seconds working out the state of the world before doing its thing is not really ideal. So I went through quite a few ways of improving caching. I tried adding an SQLite database to cache the state of the Git revisions. That was worse than not having a cache at all, which surprised me. What I did end up going for, in fact, is just a really, really simple on-disk cache, basically to track the ordering of Git revisions. That might sound like a pointless thing to do, but what we used to have was fairly obvious, fairly easy. If you have CVS revision 1.21, you know just from comparing digits that 1.22 is probably newer. If you have Git revision abcde, you don't know whether bcdef is newer or older just by looking at it. So I ended up generating something where we dump the ordering of the Git revisions for each file into a cache, so then we can do a cheap look-up later. It's a weird thing, but meh, I was quite happy it works.

With caching and whatever added, scarily, we actually seem to be faster than with CVS. A really, really cute thing, which I'm over the moon about, and I hope it's not just a flash in the pan: Laura, when she added the new privacy policy pages last week, or rather earlier this week, timed how long it took to rebuild the entire website afterwards. That was a change that affected every single page on the website, because there was a link in the footer. Previously, when she's done that kind of thing, she reckons it would take about 24 hours to build everything from scratch. With the new code and the caching I added and whatever, apparently it's down to about eight hours. I still think that's horrifically slow, but it's a factor of three. I'll take that.
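To make the revision-ordering cache described above a little more concrete, here is a minimal Python sketch. It is not the webteam's actual code: the function names, the JSON cache format and the idea of one index per file are all assumptions for illustration, and it treats each file's history as a simple linear list, which glosses over merges. The motivation is the arithmetic from the talk: if each of roughly 20,000 WML invocations spent 20 seconds asking Git about the state of the world, that alone would add up to about 400,000 seconds, or roughly 111 hours, so the expensive walk has to happen once and be looked up cheaply afterwards.

```python
import json
import subprocess
from pathlib import Path


def build_order_index(hashes_newest_first):
    """Map each commit hash to its position in one file's history (0 = oldest)."""
    oldest_first = reversed(hashes_newest_first)
    return {commit: position for position, commit in enumerate(oldest_first)}


def is_newer(index, candidate, reference):
    """True if `candidate` was committed after `reference` for this file."""
    return index[candidate] > index[reference]


def file_history(path, repo="."):
    """List the commits touching `path`, newest first, by calling `git log`."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--format=%H", "--", path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.split()


def write_cache(paths, cache_file, repo="."):
    """Do the expensive Git walk once and dump the per-file ordering to disk."""
    cache = {path: build_order_index(file_history(path, repo)) for path in paths}
    Path(cache_file).write_text(json.dumps(cache))


def read_cache(cache_file):
    """Cheap look-up side: each page build only has to read the dumped ordering."""
    return json.loads(Path(cache_file).read_text())
```

With an index built from a history such as `["ccc333", "bbb222", "aaa111"]` (newest first, hypothetical hashes), `is_newer(index, "ccc333", "aaa111")` answers the question that comparing CVS revision numbers used to answer for free. Only `file_history` and `write_cache` need a real Git checkout; the comparison itself is a dictionary look-up.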
What we've come out with in the end is a mostly similar workflow to what we had before the Git migration. There's one unfortunate change. We have a tool called Smart Change, which helps when you're doing a programmatic change. For example, if you're updating a link across multiple different translated versions of a page, you could call this Smart Change script and make your changes to the English version and all of the others at the same time. You could apply a regular expression and so on, and then make sure that all of the translations that were previously up to date would still be up to date afterwards, without having to go and annoy the translators for essentially a no-change change. Due to the way Git works with revisions, we can't do it in quite the same way, but we have, I think, an equivalent workflow. It's a little bit more involved, but it works and it's documented.

Most of the rest of what we have is similar. If you go and have a look at the translator dashboards, previously we used to have direct links to CVS diffs from Alioth. We now don't have direct links to Git diffs from Salsa, because it's actually significantly more effort to do those diffs in Git, and the Salsa admins basically looked at what we were doing and recoiled in horror: please don't do that. Because the moment you end up with a crawler or something working across our translation dashboards, it would end up bringing Salsa to its knees with thousands of diffs in parallel. So we've disabled that in those dashboards. Instead, we just give the translators a git diff command directly in the page, which they can run in their local checkout to get the same information. It's not quite as friendly, but it'll have to do for now. We may find a better solution in the future.

Performance I've touched on. We're okay with performance for now. I found a few bugs in what we already have, where things were working by accident, or appeared to be working.
It actually meant they were broken but nobody had noticed. Moving on to new tooling underneath, I was finding quite a lot of anomalies which had been there all along. My tools were finding them and nobody else had. But I want to move on.

The key point of the VCS migration is that it's happened. We don't need to do it again. The whole point of it is that it will enable us to do the things that people were scared of even contemplating before. I have a few things which I want to do with the Debian website. The prospect of fighting to do a change across lots of pages using a prehistoric version control system was just so daunting that there were lots of changes that have not happened and were never going to happen. Hopefully we can unblock some of those.

Design work. We've already had a session this week. Thomas ran a discussion earlier about updating the front page, making some changes there, and then not just the front page, but moving on to some of the top-level pages on the website to make them easier for people to use. That's not easier for Debian developers, who very rarely visit the front page because we know where we're going already. But at the moment we have a front page that is really, really not attractive to new users, and for people who don't know what Debian is, our front page isn't going to tell them. We want to improve on that. There are proposals, and I'm sure we'll hear more about that soon.

My pet project, because I'm in the images team: I want to improve our download page. Sorry, not page, pages. Many, many different pages that describe how to download Debian images and what to do with them. They're not all up to date. I couldn't tell you how many there are that are not, because they're scattered, and lots of them are not very good, because we've got this whole profusion of changes that have gone in over time. I'd like to tidy it up and actually come up with a significantly better, cleaner page. I even opened a bug two years ago about this.
I might get back to it now. There are lots and lots of other things. People would like to come up with a pretty new look. I don't know. What other kinds of things would people here like to change about the website? Do you now feel empowered to go and do it? Don't all talk at once.

One thing that did come out the other day, and I'm really happy about it, was a trivial change on the front page. The very first word in the very first paragraph was Debian, and it was a link back to the page it was on, which is utterly pointless. Noel, in the session earlier this week, actually posted a merge request, which I just hit merge on, and it worked. Laura then had to go round and update the translated pages. If we can get to the stage where small changes are really, really easy to accept and bigger changes are actually possible to contemplate, I'd love us to be more agile, maybe more daring, make more changes more rapidly, and actually see some proper innovation and all of those good synergy words. Let's see what we can do to make our website better, now we can.

One of my pet peeves, and again the download pages are an example of this, is that I think we have too much content on the website. Good content is great, but content that was added once 10 years ago, is clearly bit-rotted and that nobody is maintaining is worse than not having it in the first place, in my opinion. If we're giving users bad advice about what we're doing, or how to use Debian, or how to use their computers with Debian, I think that's not doing them a great service at all. Also, if we have so much content, it gets daunting to those of us who do want to make changes. If you're not sure exactly where to make a change and make it stick and make something consistent and improve things, it can be very, very difficult, I guess, to get the motivation to actually make those changes in the first place. We were talking last year about maybe splitting the repositories and splitting things out of the website.
We didn't get there yet. I think it's still something we might do. It's now significantly easier to do, if people have ideas on what we should be doing there. Again, talk to us. But essentially, discuss: what would people like to see, what would people like to work on? Or is it all perfect already and we should go home?

I would like to clean up the main web page and maybe the second level. During the Debian website BoF I showed... Maybe we can show this again, because then we also have it on the video. I just created an example of how the homepage could look, and that's what I'd like to work on. I'll start by creating merge requests, and maybe see if I can get write access later if it's needed.

The easiest way to go right now for that kind of thing, and I'm glad it's possible, is to do a merge request. Posting a link to that merge request along with, say, a screenshot showing what your new page looks like is probably the most effective way of convincing people that what you're suggesting makes sense and that it looks good. Does that make sense? In terms of actually merging changes and whatever, a merge request is good. We will absolutely care about updating translations. One of the things that has happened, in fact just during DebConf, concerns the staging version of the website that we used to have, which we could build things on to be able to do examples, so we could check before and after. That's just been re-set up by Pabs, and Laura has been playing with it too. That would be a really, really useful thing to use. If we could even set things up with CI, a shocking idea, we could maybe have multiple staging sites, rebuild automatically, and just click from one to another and see how things compare.

For what you proposed, I think it would be good to split it up into several smaller chunks, because you sort of went over the whole page, and if that's all in one single change set then it might be more difficult to discuss the different parts of it.
I will start with the short and simple ones. Maybe the question is: I made some suggestions to remove larger parts. What is the process for this to be discussed in the web team, if I make a merge request? Should we discuss one or two things right now?

Sure, go ahead.

Maybe you could open it; I can give you the link on the IRC channel of this room. One part is that I've added a section, Why Debian. This is just an example, and I just copied from the web pages of other Linux distributions, and I would like to have other people who are better at wording. I think it's very important to show the first-time user of our homepage why he or she should use Debian. Why is Debian different from other distributions, and what are we proud of? The idea was to have it at the very top, just like an eye-catcher, so maybe just three to five items. What should we list there? I think in the website BoF some people said that's a very good idea, but now we need content and not just copies from others.

Then the next part, Getting Started: there are a lot of sentences there, long sentences, and I also said sentences are bad; just have a list of items with links on them, and fewer links or less information there. I completely... The next thing is the news: I only showed three items, and the sentences below the news were replaced by just two items. And I also removed the section which was just a huge list of security advisories. So are there any objections concerning removing the security section? For example, this would be a very simple one. Or should there be some more discussion about these several items I showed?
I can understand what was brought up with respect to the security parts: that the short descriptions and the names might not be very well chosen. It's always just saying security advisory and the package name, and it might be better to have a more useful short description of what the issue with the package is in that space. But that's something different. One thing that I don't agree with is that we should do away with the security updates altogether because they show that we are completely buggy, because it also shows that we care about security, and one of the parts of our social contract is that we are not hiding problems. Putting it further down, so that it doesn't need to be on the front page, on that ground I agree with, but not hiding it completely.

Because we have a dedicated system. It has a cache in front now, so it should take some load, and it's up to date to the minute. Nobody needs to be... We could just have a top level and move everything there. We can link to it on the front page if we want, to say you can check the latest updates here. Job done.

As a medium-term goal, I'd like to reduce duplication, because one of the big problems we have with the whole Debian website collection is that it is very, very redundant. You find stuff on the home page, on the wiki, on dedicated pages; some really new thing is on debian.net, by some developer, that got nowhere but you do need to look at it. You have packages, you have three alternative versions, you have tracker, and so on. So I think what we should do as a web team is look at things from a customer perspective and put that on the home page, and that is the whole home page. No docs, no security stuff, no translation of these things that create a huge amount of work, but a static collection of what 90% of the customer base comes looking for, and not the whole developer thing, because people who are developers don't look in there. So either do something which is relevant for a newbie to learn packaging, and here we would first probably need one reference
workflow, because we are Debian, so we have 20. Take one reference workflow and document that, and that does not necessarily have to be on the main www; it can be on whatever, learnpackaging.debian.org or so. And the developer manual and the policy and stuff like that probably should be on the site where Debian developers look, and that can nicely be linked from the home page in case people want to drill deeper and forgot the link or so. But don't scare people with it who are interested in using Debian.

So in this example I also removed the whole block with like 30 to 50 links, where we repeated the menu, the four menus: About Debian, Getting Debian, Support and Developers' Corner. The first comment on this was, oh, if we remove all these links then we cannot access them. But if you go to the second level of the pages, this block is repeated. So I would say remove it very early from the front page, because it's still on the second level of the pages.

There were a few of these links that need to stay. For example, our privacy policy we need to keep, picking a new example. But keeping this as a block at the bottom of the page rather than the top, I think we all agreed the other day, makes a lot of sense. And definitely things like picking the language that you're using: rather than have a large lump of screen real estate taken up with that, it should be a drop-down box, and then it fits significantly more easily at the bottom. Lots of these things could just make it easier. It would be really nice. I think you're echoing something I was saying earlier, to actually remove a lot of our duplicated content, in particular because there are so many versions of things that it's impossible for some people to find the canonical one. You don't know which one is right, and if you're looking at a page that is maybe 15 years old, in some cases it may well be telling you things that are so out of date they're actually dangerous now, but we don't know. And because the website is so big and, to be
honest, the web team is not that big, it's very difficult for a small team to actually do that effectively. I think it's time we should start just slashing away at things, actually start doing sprints to just remove content. Or not suddenly remove it, but move it off into its own area. To say, for example, here are all our security advisories: that doesn't necessarily need to be alongside the main web repository, and it doesn't need to be on the main website, directly in your face on the front page. The thing that did come out the other day was that, for example, the news mentions a load of releases, which are not all that frequent, but doesn't mention that we're having a big conference this week, which is possibly the biggest event that happens to Debian every year, apart from maybe a new major release. We could do with making this more visual. We need to make it more friendly to the people who are not us, not used to a command line and happy to hack on the HTML. We need to actually make it welcoming.

This is kind of the same problem as in the publicity team, because there are many channels which they play, and it's only like two people that basically do most of the work, with a lot of support from translators and so on. Grinding the axe is something they don't do, because they have to chop wood. And I think this here has acquired cruft for a decade or more in the same way. I would really suggest doing a fresh start. Do a competition or something like that for a set of students: how would you want to make a Debian web page look, the landing page look, in this world where it should be working on portable devices and stuff like that as well? And then fit the content we have in there. Maybe we need something new, and the rest I would just put into an archive, just as we archived three quarters of Alioth, and it's just there in case somebody needs it.

Clearly, obviously archived, to say this is not current, this is here for history.

Right, exactly. And it's either interesting to historians,
which we have some of in the team, or we may dare to just throw it away after another decade and say, like, history and gone.

So I'm keen on something Jonathan brought up earlier in the week: we actually need to work out our requirements, rather than just necessarily going off and doing lots of different things. We need to come together and agree on what we want the website to do, and this is a project-wide discussion; probably not everyone is going to care, but a lot more people than just the dozen people in this room might be interested. If we can come up with clean requirements, we can then actually start developing. We can even put out tenders to commercial design folks. We don't necessarily want to be doing that; we also want to encourage new free software, new free design people to get involved and see what they can do. And then definitely yes, fitting our content into a new design would be really important. But I don't think we should be going off half-cocked and just saying let's trash this, trash that, without actually having a proper holistic goal, actually working out what we want to do.

One question that was brought up on IRC was related to JavaScript, whether it's still considered the devil to some extent. I would say it was never really considered that evil. People use it for evil things, and we especially still like to have an accessible website. So JavaScript for the Debian website should be something that is assisting, but not the sole way you deal with the site. So if the site still works without JavaScript and it's accessible and people can get through it, then yes, if JavaScript can help with some things, I don't see a big objection to using it that way.

That absolutely echoes my preferences as well. So I would love to suggest that we actually get together and have a web team sprint at some point to work through some of these things, to start driving the process of a new design, of a significant re-architecting, not necessarily of the technology
that we're using, but definitely to gather the requirements from the rest of the project and start coming up with proposals and driving this to make the next generation of debian.org.

Actually, I very much like the suggestion from Daniel about deduplicating content, especially because it was in the context of the security advisories. Getting the security advisories onto the website was always kind of a pain. It worked most of the time, but when it didn't work it was quite painful to fix things. So it was quite annoying for both the security team and the website team, and it required additional effort from the security team which shouldn't be needed. So if parts of this were visible on the security tracker, like the content of the mails that they send out to the mailing list, we could just link to that and get rid of the part that is in the Debian website. And we have many, many thousands of advisories which are taking up time when building the website.

I have some numbers. We have like 52 thousand WML files, and 20 thousand of them are dsa-*.wml. And an old security advisory is historic. It's history, it's not useful today. So that's why I said there is security-tracker.debian.org; link to it and it's fine. I honestly wouldn't even include any type of content, because this is the main web page, this is the landing page. Debian is a secure distribution, in our marketing blurb on the homepage, and if you want to verify that or want to have details or so, somewhere there will be a link to the security team, somewhere there will be a link to the security tracker, and that's it. I think the whole architecture of this should be that if you sit down and have half an hour's time, you have navigated the whole homepage and what the web team considers the essential information on Debian, the canonical, quality-assured information on Debian, either as text, because it's a primer for the common use cases, or as links, as a curated list of links of where to find more stuff. And that's it. And everything else that
this thing does, which as Thomas said is just wrong to do... This monolithic "we have this one homepage and everything integrates into it" thing died a silent death a long time ago, and we should just follow suit. And if we get rid of that, we are probably down to less than one hour for a total rebuild, from eight hours now.

And actually doing that is painful. When we come to do a point release, the actual number of changes is half a dozen places, and then it's walk away and wait for the rebuild to happen. Please God, can we get away from that? Improving this would improve the lives of a lot of people. So I'm hearing a lot of agreement. Does anybody disagree with what we're suggesting about slashing a lot of this stuff? So who would like to get involved in a web team sprint to make a lot of these things happen?

I just want to put something on the record which Laura said on IRC: the problem with whatever we move from the website to other places is that we lose translations, which are currently French, Russian, Swedish, Danish and sometimes other languages. They are doing great work. The security tracker is only in English.

I think that's a valid point. We need to discuss what is required content in a language. I think if you run the Debian distribution in any type of production environment and don't speak enough English to understand a security advisory, you are in really big trouble, because if you wait for a translator to make that available to you in Portuguese, you are just not able to fulfil your job in operations and infrastructure security. So I think we have to keep this in the back of our minds. It's different where we want to make the distribution as accessible as possible to a wide variety of skill levels, and that type of information I think is very worth translating into as many languages as possible, as part of the outreach.

Let's make the important stuff accessible. A lot of the translations for the security advisories are almost entirely
mechanical. As part of the work I've been doing, I've found a whole load of helper Perl scripts: copyadvisory, for example, which copies an existing English advisory, and there are various variants of that for German, Russian, Swedish, which just have a large set of regular expressions that replace certain English phrases with a translated phrase. Obviously people are then going through and tweaking those afterwards. But if we're at the stage where most of the translation that's going on is automated, because the English itself is very much the same words over and over again, we've got to come up with a better way than this, surely. I would much rather have our translators spending their time on improving the installation guide, or how to engage with Debian, how to get involved, than just doing that kind of dry work.

When I looked at the content of the homepage and the next level, I found a lot of text where I thought no one had really read this during the last ten years, because it was very stupid text or it did not have any additional information. And that's what I wanted with the cleanup: to also just read the text that somebody wrote a very long time ago and check if it's really useful information for the user.

That goes for a lot of our documentation, and this is beyond the website. Our installation manual: I was called upon before the stretch release to go and improve things for the ARM ports, and I was finding lots and lots of discussions in there about floppies and USB Zip drives, which haven't been relevant for a decade. Because we have such a large body of text, again, it's daunting. It puts people off actually working on them and keeping them up to date. If we can slim things down and remove a lot of the duplication, and come up with, instead of just a high quantity of docs all over, a smaller set of better docs, please, let's go for it.

And just to reiterate that again, I recommend doing it the other way around: what is needed from a user's, a customer's, perspective of the web page, and
then copy over what you need from what you already have, so you don't need to create it anew, and throw away everything else, in the sense of archiving it. If somebody complains afterwards, we can always update that; we can always grab things back if we need to. It's just that if we go through the whole body and try to reduce it, we are arguing over every little point that was work for people in the past. We have people who translated it; we don't want to devalue their work by throwing away all of the hours of translation work. What helps us is asking how www.debian.org stays relevant, or gets relevant again, to users. That's the question we need to ask, and the rest is just trying to fit in what we have.

I think we're broadly in agreement. Do you want the mic over here?

From a translator's point of view, WML is so harmful. It's hard to check the original message, and then updating and reviewing is so hard. I hope we can use gettext PO files for translation. Maybe it means throwing away WML and using another markup language, like Debian Policy, which used SGML and now uses reStructuredText. I hope it becomes more comfortable to do translation work on the Debian website.

I understand totally. Even just doing English, WML is not welcoming.

The development of WML sort of stopped last century, 15 years ago at least. It's sort of a stale thing. I can understand some of the attraction with the use of macros and things; it's great if you're wanting to produce a very large amount of automated stuff. The thing is, if we're actually considering not having everything in there, but being more open to having the data where it originates from, like the security information, the security data... Maybe the issue why we didn't want to move too much stuff to the wiki is that there is no translation framework there, and things like that. But that aside, if we are having more data in the place where it originates from, then we don't need the scripts to port it over and duplicate the data, even in an
automated way. So then we might also be more or less open to embracing a new engine to generate the content for the Debian website. The main part is to separate the design from the content, so that it enables translation in an easier way, and so that not everyone has to be that aware of how to not fuck up some tags here and there. But in the end, if there is something new which would be up to the job, I think the web team would be willing to investigate, because now, with Git underneath and having a VCS that actually supports branches in a useful way, we have the possibility to try it out.

We've got a lot more freedom to try new things now. So if you have any ideas, or if anyone out there has anything, because I'm going to be writing this up as well, we'll be calling for suggestions on things we could do, things we could use, whether that's technology, designs, whatever.

Should we use the last five minutes to collect a list of the things we would like to do or ask for in the Gobby? Because then you have an easier job. What comes to my mind is: get together in a sprint and collect the relevant information from a website user's perspective, because that's the input. So that's number one. Somebody please jot it down. Number two is, where do we kidnap ourselves a new user?
Maybe we can invite somebody who has some website structure background, a design background not in an artistic way but in a content way, to help us do that. The second thing is possibly, if you want, to invite a larger community to come up with designs for a modernised web page, so that if we relaunch it, it also looks new. And the third thing would be to look from a technology perspective at what we would actually love to have in the back end if we took a green-field approach, and possibly replace WML with something that makes translators' lives easier, and maybe our lives easier in some other ways. So those would be kind of the three things that I think could be on the to-do list.

And I think that wraps us up nicely. So unless there are any last comments, I think we're done. Thank you very much, everyone. As I said, I will send a follow-up summary about this, and let's see if we can get something organised to make this a better place for us all. Thanks, folks.