Let's go through status, plans, and help needed, and oh hell yes there is help needed. And if you've got any discussions and ideas, please talk to us if you're here, or on IRC if you're not, and we'll try and relay questions. Please take notes in Gobby, someone; I will endeavour to send out notes after this, though it may take a week or so. So, the www web team: I have commit access, but I don't pretend to have any great power or control over the web team. Does anybody else here have anything more? We're still using CVS, despite the fact that people have been talking for, what, a decade or so about moving away from it. I would love it if we moved over to Git; there's been discussion about that several times over the years. There's been some pushback in that, A, people are scared that it's going to be a big repo, and that's going to put people off downloading it and working with it. I don't believe that to be a major problem at all, personally, but I'm not in that situation; I'm not one of those people who might be put off, clearly. The other thing is that the Git workflow, it is claimed, may put off some of the more casual contributors. I'm equally not all that convinced that CVS is a wonderful tool for those people either, but it's difficult to know; I don't tend to talk to these people very often. I hope other people may have more insight. Is the mic on? Yes. Rhonda's done a conversion of the website to Git, but I'm not sure if it's updated; it's certainly not synced with the website. How big was it at the time? I have no idea.
My own personal thing is: I used to be the CVS maintainer. It's horrible; I got away from it and that made me very happy. I haven't had to use CVS for anything else in five years. Every time I come across doing stuff for the Debian website, and believe me I don't do much, it's typically tweaking pages to do with CDs; all of my finger memory has gone away and I end up having to fight with it. The biggest problem with the conversion to Git is that the current translation system relies on the CVS ID numbers. Oh God. CVS revision numbers, so you'd have to switch to Git commit IDs, I guess. Of course they're not going to be... Of course, I've seen this in other situations: they're not monotonically increasing, so it's not as easy to work with. Yes, and the other thing is that for some pages there's a chain of translation. The original might be in Polish, and then an English version gets translated, and then that gets translated to German and Japanese and whatnot. So it seems a bit complicated, and we'd probably have to invent a different way of doing that. A different workflow? Yes, probably not involving commit IDs. One thing that could help with that is converting to something where we can use normal gettext files. I'm not sure if that's possible with WML; anyone know? I don't see why not, but we don't have DNS here, unfortunately. Anything else people would like to talk about regarding the website? That's my pet peeve out of the way. The reason I put the BoF here together for the two groups was to not take up too much time. I guess everything's perfect; otherwise more people would be here, with torches and pitchforks. Is anyone watching on IRC interested? There are a few people on IRC, but they're not saying anything. I guess we'll move on; this could be a short BoF. The wiki, which is the main thing that I'm interested in, I'll be honest. Paul and I are the two active wiki maintainers, in the sense that we have shell access on the wiki server and we can do just about everything on it.
Several other people have superuser access within Moin itself, which means they can easily do things like set ACLs and delete spam. More support, more help is always useful. Believe me, like every team, we have many, many other things that we want to do, if only we had more time to do them. And the individual people are busy doing all the other tasks too, but fine. I did a check last night and we currently have 12,203 pages in the wiki; of course, we probably have a couple more since then. That's a scary number, it really is. Most of them are out of date. This does not count spam pages: there are quite a lot more pages that did exist but have been deleted since. This is from SystemInfo, so it only counts things that are currently valid. The number that scared me even more was that the system seemed to tell me we've got 12,500 registered user accounts. Now, we did have about 3,500 to 4,000 more spammers, which have been disabled. That number seemed very high, but I couldn't find anything that told me I had counted it wrong. We're currently running a mostly-clean 1.9.4, with a few patches, mostly ones I've come up with for helping with spam. We're about to move to 1.9.7, probably today, again still with a couple of local patches. I'm going to jinx things and say: I think we've solved the spam problem. Three years ago, when I really started getting involved with the wiki stuff, we had a horrendous problem of people just posting all kinds of crap all over the wiki, like they do. We've gone through a whole slew of different approaches to stopping spam. The main one is that you must now be a registered user to be able to edit pages at all. And to create a user account these days, shockingly, you've got to register with a working email address; email verification is still something that's missing in stock Moin. The patch I have to add this is something I'm hoping to get pushed upstream again this week; I've been talking with upstream.
Even forcing people to have working email doesn't stop the spammers; you'd be amazed how many we get, still trying to get through. The second part of the spam stuff is basically that I've come up with a big set of heuristics and encoded them in a Perl script. Whenever anybody tries to sign up for an account on the wiki, we have a script that basically has historical knowledge of what looks good and what looks bad. It's not quite Bayesian, but it's approaching it these days. So, as a rule of thumb, if you try and sign up for a wiki account and you're coming from Hotmail, it's not going to work, because to a first approximation, 99.99999% of Hotmail sign-ups are spammers. Don't get me wrong, it's not just Hotmail. If you're using a Gmail address, it's unlikely that you will be able to create an account straight away, unless you look particularly non-spammy. Again, all of the free email providers are too easy for spammers to sign up with. Paul, we have comments. Oh, yes. Hi. Do you have to be a registered Debian developer to get a Debian wiki user account? No, not at all. Just asking, because I'm a noob, so I'm not a registered Debian developer. Sure. And we have a question at the back as well. I'm new to Debian. Could you explain how you're deciding whether a user looks spammy? Sure. A typical spammer... in fact, I'll show you in a moment; I've got a demo of the system I currently use to look at it. A typical spammer sign-up will have some random sequence of alphanumerics at hotmail.com for an email address, and for the user account name they'll have four or five digits or capital letters, with no overlap between the two. They will be coming, as often as not, from a random Chinese mobile IP, or from a known spam haven. I'll list a few of those in a moment as examples.
If you're coming from one of those, then the anti-spam script gives them all scores, and if your total score is above quite a low number, you just get told no. If you come from the same IP address too quickly, say in less than two or three days, and you try three or four times with, again, obviously ridiculously spammy details, we blacklist the IP and say: sorry, that's it, you're not getting in. Blacklisting the IP isn't just a "you can't sign up". It's a "well, you're clearly not interested in Debian, we blacklist it, you don't get to see the Debian wiki any more". Going beyond that, if there are obvious patterns, and we see them all the time, where a whole slew of addresses from the same network provider all arrive in the space of one overnight session, and I've seen, say, a couple of hundred attempts from loads of addresses in the same /24 or something, we will just block the /24. If I go... let's see if I can make this fit. Basically, I have almost a console running on the wiki, and I'm not on the network, of course, am I? Bear with me. Are we using CAPTCHAs? I remember something about CAPTCHAs. We tried using CAPTCHAs in the past for a short period, and I mean short, like a few weeks; it helped. There are so many problems with using them, especially reCAPTCHA: blind and partially sighted users are screwed over. Privacy violations. Privacy violations, there are so many problems with it. To be honest, the biggest problem is that it doesn't work: we found it didn't help for more than a few weeks. The spammers have already solved the reCAPTCHA problem, their problem with it. They either have systems that manage it automatically, or for pennies they get people in the third world or wherever to just sign up and do them for them.
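The scoring scheme just described can be sketched in a few lines. The talk confirms the overall shape: per-signal scores summed per attempt, a low acceptance threshold (10, mentioned shortly), hostile email domains, random-looking names with no overlap between account name and address, and blocked network prefixes. The individual per-rule numbers and domain table below are illustrative guesses, not the wiki's real (private) heuristics; the actual script is in Perl, this is a Python sketch.

```python
import re

# Illustrative values only; the real per-rule weights are private to the admins.
DOMAIN_SCORES = {"hotmail.com": 15, "yandex.com": 8, "gmail.com": 6}
BLOCKED_PREFIXES = {"198.51.100."}   # an example blocked /24 (RFC 5737 range)
THRESHOLD = 10                        # total score above this: signup refused

def score_signup(username, email, ip):
    score = 0
    local, _, domain = email.lower().partition("@")
    score += DOMAIN_SCORES.get(domain, 0)
    # random alphanumerics with no overlap between account name and address
    if re.fullmatch(r"[a-z0-9]{6,}", local) and local not in username.lower():
        score += 8
    if any(ip.startswith(p) for p in BLOCKED_PREFIXES):
        score += 20
    return score

def allow_signup(username, email, ip):
    return score_signup(username, email, ip) <= THRESHOLD
```

A plausible-looking signup passes; a random Hotmail handle from a blocked range racks up a score well past the threshold and is refused.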
The fact that we're already seeing people coming from Hotmail, Gmail and Yahoo, where you've already got to have done a CAPTCHA to get the email address in the first place, means that people have solved this problem. We can slow them down a very small amount, but unfortunately it just had no effect, and it did legitimately annoy a number of people who wanted to be contributors. Again, I added a reCAPTCHA patch for Moin, which I'm pushing upstream in case other people want it; I wrote it and then turned it off. Steve, on that front: someone sent you a script so that we can whitelist Tor, is that right? Yes. It would be great to integrate that at some point. That would be lovely. So that we don't ban people who like to be anonymous. Yeah. Again, that's something I'll come back to in a moment. If you have a look, I have a simple screen session running on wiki.debian.org. I don't know how visible that is. Yeah, and then it's basically going to go away. Is that more readable? That's about as big as I can make it. You will see, basically, I have a monitor script that runs, just for my information, every couple of minutes. It checks through the last so many days' worth of account sign-up attempts. I've got something that will give me the IP address that an attempt came from; of course, it's got no reverse DNS, this one at the top. We have, as I was saying earlier, random-looking attempts. So this person, or this IP address, has tried to sign up twice with two entirely arbitrary-looking stupid names, with two also entirely stupid-looking random Hotmail accounts. They're not getting in, so we've denied both of those. The average score that we've given them is 29, just based on the information that gives us. A score of 29 is much more than the threshold of 10; they're not getting in, it's as simple as that. They've only attempted twice.
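The Tor whitelisting idea raised above would slot naturally into the IP-scoring step: known exit addresses simply bypass IP-based penalties. This is a hypothetical sketch of that one check; in practice the exit list would be fetched from the Tor project's published bulk exit list, and the addresses and penalty value here are stand-ins.

```python
# Hard-coded example exit addresses (RFC 5737 documentation range); a real
# deployment would refresh this set from the Tor project's exit-node list.
TOR_EXITS = {"192.0.2.44", "192.0.2.45"}

def ip_penalty(ip, blocked_prefixes):
    """IP contribution to the spam score, exempting known Tor exits."""
    if ip in TOR_EXITS:
        return 0   # never penalise a Tor exit, so anonymous users can sign up
    return 20 if any(ip.startswith(p) for p in blocked_prefixes) else 0
```

With this in place, an exit node inside an otherwise-blocked /24 still gets through the IP check, and the remaining (name and email) heuristics do the filtering.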
If this same IP address had tried a third time, with yet another username and yet another email address, the system would say "you're clearly irredeemably a spammer, go away" and block the IP address. The next one down looks maybe more believable, in that the email addresses are at least vaguely consistent, but still, Yandex is a well-known Russian search engine and email provider. Again, they've got stupid names. It's not a guarantee of spam; none of this is. A lot of it comes down to judgment, but I'm not about to block them. If they try again with an average score of 28 or higher, again, they'll get blocked. Yesterday I wanted to sign up for the wiki to edit the wine and cheese party page. I have a live.in account, simply because it was easy for me to get that handle, and I did not receive a confirmation email. How do you deal with people who are genuine but still have email coming from Live? That is a very good question. When you try to sign up, if you have been blocked, you will actually see... it's not as obvious as it should be; I know that this is a problem. After you do the register-account step, you should be told at the very top of the next page, in a banner, "sorry, there was an error". If that error persists, it will give you an error code, and the error code is purely and simply the number 900 with your spam-possibility score added to it. So if you get an error code saying "error 914: please try again; if this continues, mail wiki@debian.org", that's what it's meant to do. It's not perfect, and for people who don't speak English, and we have quite a number of those obviously, it's not great, but it's the best thing we can do for false positives. If you mail wiki@debian.org and say "I've tried to get in and I've got this error message", and we deal with these every day, we can whitelist the email address, and the next time you try to sign up, it will let you in.
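That error-code convention is simple but worth pinning down: the code shown to a refused user is 900 plus the computed spam score, so a support mail quoting "error 914" tells the admins the score was 14 without exposing the scoring details to the spammer.

```python
# Error-code convention from the talk: refusal banner shows 900 + spam score.
def error_code(spam_score):
    return 900 + spam_score

def score_from_error(code):
    """Recover the spam score from a user-reported error code."""
    return code - 900
```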
I don't want to spend... I've spent too long on this already, because frankly spam is horrible, I hate it. I'd actually, and I'm not kidding, see it as a capital crime. But the anti-spam stuff isn't the point of a BoF, and it's not the point of the wiki. The point of the wiki is the content, and that's why we want to get shot of the crap. Perhaps. We have a couple of comments from the Gobby. Yes. "Debian deserves an Arch-grade or Gentoo-grade wiki"; that's one of the comments. There's an idea to freeze the wiki per release and copy-on-write to the new version. I've added an alternative idea to that: add some macros allowing pages to know what the current release is and show the stable release. That's pretty easy. You've done that. I've done a couple of things so that you can show the name of the current stable release; probably not that hard to extend that, or to have something like that. Our wiki is very large; we have a huge amount of data. Like any wiki out there on the net, probably half of it is outdated or crap. I wish that were an exaggeration, but it's probably an underestimate. The Gentoo wiki was great, as I understand it, and then they moved site, decided they wanted to start again, and it never quite happened; a lot of the content just never reappeared. The Arch wiki is absolutely awesome. They've got a great community of people who keep on posting vast amounts of really good documentation. Some of it is Arch-specific, obviously, and doesn't necessarily translate. But half the time, if I go looking to solve a problem, a laptop configuration issue or whatever, I do a search and the Arch wiki is at the top, with not just good content but good links as to where to find more information. Yes, it's great. The Debian wiki is also great in some respects, and it often comes up high in searches. Unfortunately, quite often with the same outdated information that I know is outdated because I wrote it a few years ago, that kind of thing. Hell, the internet's like that.
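The "current stable release" macro idea above amounts to this: pages reference a symbolic suite name, and a single daily-refreshed table supplies the facts, so page text doesn't go stale at each release. A minimal sketch, where the table contents and the function name are illustrative (the real macro names in the wiki may differ):

```python
# One central, periodically refreshed table; codenames/versions are examples.
RELEASES = {
    "stable":    {"codename": "wheezy",  "version": "7"},
    "oldstable": {"codename": "squeeze", "version": "6.0"},
}

def release_codename(suite):
    """What a codename macro would expand to on a wiki page."""
    return RELEASES[suite]["codename"]
```

A page then says "the current stable release" via the macro instead of hard-coding a codename, and a freeze-per-release scheme could swap the table rather than edit every page.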
I haven't, no, not yet. I know they're using MediaWiki, or something that looks very like MediaWiki. One of the issues with the different wiki engines is that they all have very, very different spam setups and very different spam solutions. Moin is very much our preferred setup. There's something that happens every couple of years: we have a BoF like this, people ask "well, should we use a different wiki engine?", and the answer is "possibly, but who's going to do it?". And the big problem with switching wiki engines is migrating the content, or not, and then you end up with an empty wiki and nobody uses it. Sorry. So if someone was interested in doing at least a test case of a MediaWiki migration, that would... I just asked this on IRC, but that would require having an actual copy of the wiki. Obviously the Git repository you can get from teams/wiki is just the scripts. Is there a way to get a copy of the entire wiki? We can happily give you a tarball dump of the entire wiki; we have done so for a couple of people in the last few years. Yes. In addition, I wrote a patch for Moin so that we can have a daily generated offline copy in HTML format, for people in remote locations who want to read the wiki. Unfortunately, that hasn't seen much traction upstream. Steve, do you have any way to get people to review my patch? It's been reviewed a few times and I've fixed it up. Moin upstream are very helpful; Thomas in particular is really cool, but like a lot of us he's overworked. The best thing to do with the Moin people, to be honest, is to talk to them on IRC and push the patch into the upstream Moin wiki as a patch for review. I've done both of those things. Ping them again; they get busy. Again, they don't tend to do that much by email; it tends to be IRC or the wiki itself. Thomas is good enough that he's interested in the patches that we have for Moin 1.9 for the reCAPTCHA and the mail verification.
I've just been too rubbish at refactoring them and dealing with his review comments, which are at least six months old, possibly twelve. Again, it's on my list to do this week. Other things that we do have... again, a wiki is all about the content; the engine is frankly irrelevant in the long run. The main thing that you need is good content, and good content, of course, is correct, up to date and maintained. I do have a first cut of a script that will walk through a Moin wiki looking for pages that are out of date. We could start using it: pick out the users who have contributed to pages previously and, if pages are tagged appropriately, say after six months, "you suggested that this page be reviewed every six months; please check and update it". I think that's a useful thing, and I've never seen any wiki do that. I can't say I've played with every wiki on the planet, there are far too many, clearly, but I think that will be a useful thing to have; I just haven't rolled it out and really tested it yet. There are issues, of course: people doing trivial spelling tweaks will reset the counter. Equally, the people doing the trivial spelling tweaks will then be the last people who touched a page, so they will get pestered about the content: "you're clearly the expert on the details of the kernel driver for this USB webcam, please tell us more about it, please check it's correct", when all they've done is add a link to the Russian translation or something. It's not perfect, but I think it'll be a good start. Yes, it is. Do you know how many people use it? Approximately zero. Hello? What about using templates, have we explored that? Basically bootstrapping new pages with templates for common... Moin does have a template feature, and a bunch of pages already use it. There are 20 or 30 templates already, depending on what kind of page you want to create.
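The stale-page checker described above reduces to a simple rule: pages tagged with a review interval get flagged once their last edit is older than that interval. A minimal sketch, where the dict-based page metadata is a stand-in for reading Moin's actual edit-log, and the six-month default is the interval the talk mentions:

```python
import time

SIX_MONTHS = 182 * 24 * 3600   # default review interval, in seconds

def stale_pages(pages, now=None):
    """pages: name -> {'last_edit': epoch seconds, 'review_every': seconds}.
    Return the names whose last edit is older than their review interval."""
    now = time.time() if now is None else now
    return sorted(name for name, meta in pages.items()
                  if now - meta["last_edit"] > meta.get("review_every", SIX_MONTHS))
```

The trivial-edit problem from the talk would live one layer down: deciding which edits count as "substantive" before updating `last_edit`, rather than in this check itself.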
For certain things, say announcing that you're going to do a BSP, exactly, or whatever, we have a really good template for that already, so you don't need to fill in all the details yourself; you can just fill in the blanks and it generates the rest. Again, it comes down to people using them and people maintaining them. If you have ideas for things you think we're missing, please chat. That's about it for what I had to talk about, really. We are always looking for more people to help, as I said. Especially with the bugs. Especially with the bugs, because they're kind of piling up and aren't being looked at. Oh, yes. There is a wiki.debian.org BTS category. We're not very good at tracking bugs there, necessarily, but we do see them. Honest. Can we mention some of the special features that we have on the Debian wiki? Go ahead; do you want to come up? One of the things is that when you add a link to the Debian bug tracker, we have some JavaScript that goes and looks up the status of the bug and alters the CSS on the page, so that the status is more obvious; and then you can hover over it and see the title and all that sort of stuff. Also, the wiki looks up daily what the current Debian release is, so you can ask what the codename for the stable release is and there's a macro for that, and others for release dates and version numbers. There's one more macro for linking to message IDs: it just links to the Debian lists URL for the message ID. There's also a convention: if you're working on stuff in the wiki and you want it to stay around because it's linked elsewhere, there is the permalink category, which we use to make sure that, say, especially later, if we've mailed out press releases and such that link to things in the wiki, those pages don't go away when people find them in the archives later, that kind of thing.
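The message-ID macro just mentioned can be sketched as a one-line URL builder. The talk confirms only that the macro links to a Debian lists URL for a given Message-ID; the exact `msgid-search` path used here is an assumption about the URL form, and the function name is illustrative.

```python
from urllib.parse import quote

def msgid_link(message_id):
    """Build a lists.debian.org lookup URL for a Message-ID (path assumed)."""
    return ("https://lists.debian.org/msgid-search/"
            + quote(message_id.strip("<>"), safe=""))
```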
So, one suggestion from IRC is that a couple of people could do a www/wiki bug squashing party remotely at some point, to try and get through some bugs and things like that. That would be cool. There is going to be a bit of a step change coming, of course, and I should have mentioned it earlier; I've just remembered. One thing that we don't have yet for the wiki is single sign-on. Whether you're a Debian developer, an Alioth user, or any other random person, everybody at the moment has their own separate wiki account. Frankly, that's crap. Single sign-on is mostly working, though it's still in flux for some of the Debian web space. We're planning on looking into single sign-on real soon now, and we have been for a while; the only reason it hasn't happened yet is down to manpower, and other people doing other sites first with single sign-on. So if you have a Debian wiki account and you're a DD, then expect at some point in the near-ish future you'll probably get an email saying "stop using that account", or we will somehow find a way to link that account to the single sign-on system; it's a bit in flux at the moment. And that's exactly something that would work very well at a sprint or a BSP-type setup. On another topic: I'm not sure if anyone here has other languages that they can speak and write, but it is possible to do translations in the wiki. The way it's currently set up, you add the language code to the start of the page name, with a slash after it. That's not the best way to do it, and it'd be great if we had a different way, maybe with gettext files or something; it'd be great if someone could research that, if you're interested. Certain of the system pages in Moin, because this is provided by Moin, will automatically redirect you to the appropriate page: if you log in and your machine is set up so that your browser says it prefers French, you won't get sent to FrontPage, you'll go to the French equivalent. I don't know why I picked French,
because that's the one whose FrontPage translation I can never remember. For example, I've tested this myself; it's maddening, because once you've told your browser, it knows about all the languages, and trying to turn that off again... I then spend the next day struggling to un-Dutch it or something. Can you speak a little bit about the performance and infrastructure of the wiki? Sure. We used to run the wiki on dedicated hardware that, frankly, was a cast-off from other uses. Over time, as we started getting more and more page views, and the thing I should have done as well was actually go and grab the stats for that, we realised that it wasn't going to work that well. We've since moved over to a dedicated VM provided by DSA: amd64, with huge amounts of memory, so it basically runs something silly like 16 or 32 threads of WSGI running Moin. It seems to work okay; we have found we are one of the biggest Moin-using sites on the planet. We did find a major performance problem with page saves not that long ago: you could happily read wiki pages, but saving something could take literally a couple of minutes. That turned out to be a scalability issue, a design flaw in the way that Moin notifications are done. When you go to save a page, there isn't a link from that page to each of the users who is known to want to watch it and be notified when it changes; instead it was walking through the 12,000 user accounts, looking in every single user profile to see "does this person want notifying about this page?". That doesn't scale very well. There was a patch that the Moin wiki people had already come up with; I tweaked it slightly, we reviewed it backwards and forwards a bit, and I think it's actually gone into Moin for 1.9.8, which is due to be released soon. Saves went down from literally a couple of minutes to maybe two or three seconds, which is fine. A lot of people may not have noticed it, but if you went to a bug squashing party
and the convention is that at a BSP you fill in details about all the bugs that you've fixed, so you've got competition if you've got BSPs in two locations, or you can just point people at it later and say "yay, look what we did". If you've got 12 people around a table, all of them working on bugs and going to update their list, oh my god, that was painful. So I got shouted at a lot at a BSP, and that was what prompted me to go and fix it. I'm not aware of any other major performance issues at the moment. We did have a fairly well-publicised security breach a few years ago, which was due to an upstream bug in one of the drawing plugins; that was fixed, and it caused a reset. One of the flip sides, one of the downsides, was that many, many of our older accounts were set up by people who had never put in email addresses, because you didn't have to when they first signed up. So we had, of course, to reset or disable all those accounts and force password changes, and for those people who didn't have working email, there is no way of doing an automatic password reset through Moin, because it sends you an email. So if you're one of those people and you couldn't log in any more, sorry, talk to us. The great thing about that security hole and attack was that we adjusted the configuration so that the plugins ran as a different user, and they can't modify the files. Exactly; the default setup of Moin is not very pretty in terms of security. It's simple, and it works, but it ends up with all of the data, and potentially a lot of the Moin code itself, owned by the same user. We've explicitly set up privilege separation for the wiki to get around that, and we worked with the DSA folks for quite a while to get that working; they were awesome about it. Basically, it was just over Christmas that year, and we ended up spending a lot of time overnight, if I remember correctly, installing a new machine; we migrated all the pages. It was horrible, but hey, sometimes it happens.
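The page-save notification fix described a moment ago boils down to a classic inversion: instead of scanning every user profile on each save (O(users), which meant minutes with 12,000 accounts), keep an index from page name to subscribers so a save touches only its own entry. A minimal in-memory sketch of that idea (the real Moin patch works against on-disk user profiles, not this toy class):

```python
from collections import defaultdict

class SubscriptionIndex:
    def __init__(self):
        self._by_page = defaultdict(set)

    def subscribe(self, user, page):
        self._by_page[page].add(user)

    def subscribers(self, page):
        # Constant-time lookup at save time, independent of total user count
        return set(self._by_page.get(page, ()))
```

The cost moves from save time to subscription time, which is the right trade: subscriptions change rarely, saves happen constantly.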
So, I've been wibbling far too much in here, so please, does anybody else have anything? Hello, I'd like to... I've got a question about the website, the previous topic. Each time a developer commits a change on the website, there is a wait of about four hours before a script rebuilds it; why not rebuild on each commit, for example? I think Neil has an answer for that. When I update bits of the website for the press releases, just updating the news section where all the press releases go takes about 10 minutes or so. Basically, it's just huge: the actual size of the site, generating it from WML in all the languages we have; it's about a 540-megabyte repository, so it's basically the time just to generate all that HTML that's the issue. For pages where we know they shouldn't affect very much, it is, as I understand it, possible to just rebuild small bits of the website, but it's always potentially risky, given the way that WML lets you cross-link things and essentially #include bits and pieces and use macros all over the place. That's what it boils down to. It's not pretty, it's not great, and I'm not aware of a better solution. I think we are just about done... oh no, one more. Currently with CVS, if I understand this, there is a hook which is used for translation or something like that; is it possible to use more gettext and PO files to fix some of this problem? There was some discussion on IRC about this: apparently po4a supports WML, so we could possibly do that. At the moment there are some parts of the website that are using gettext, but not the whole thing, so there would have to be a conversion, and that could be totally manual, I would think, but I'm not entirely sure. The biggest problem with a lot of the ideas that people have spoken about, like the Git migration and whatever, is that we are struggling to find the people bandwidth to get through a lot of this; more volunteers needed, as always. And if you have any questions about the website or the wiki, Steve and
I are both here all week, so come and ask us. We are both on the relevant lists, and of course I should have put this up, but we didn't: debian-www at lists.debian.org, or the wiki.debian.org team, is the best place to get in touch with not just us but with all the other people involved as well. Well, thanks for coming; hopefully that was useful.