Hi everyone, this is my talk on replacing extensions. About me: I live in New Jersey. For a long time I lived in New York, but we moved last year. I'm the author or co-author of about 20 MediaWiki extensions. The number keeps growing, but my most important one is the first one I created, which is called Page Forms. I'm also the founder of WikiWorks, which has existed since 2009. That's a MediaWiki consulting company. And I wrote a book called Working with MediaWiki. Oh, there it is, Peter's holding it up. As far as I know, it's the only MediaWiki book that's been published since 2010. It came out in 2012, and I've updated it a few times since then. And since 2017 I've worked at a company called Genesys. There have been people from Genesys at previous conferences. They use MediaWiki, and they're trying to spread it further within the company. So I'm still at WikiWorks, but I'm not doing day-to-day stuff there anymore.

OK, let's jump right into it. I want to talk about replacing extensions: when it makes sense to create alternatives for extensions, when it's happened before, and that whole idea.

So the first case study is an extension called FlaggedRevs, which allows for displaying a revision other than the latest one to users. On Wikipedia, it's known as Pending Changes. It's very complicated. It has a lot of what I think are unnecessary features. There's actually a way to review each revision on different scales of quality, how good it is. And then you can have a consensus to decide which revision should be shown to users.

Approved Revs: I created this extension. It's the first extension I created that duplicated another extension, that wasn't really doing anything new. In theory, it's very simple. There's just one approved revision per page, at most. Whoever is an administrator can change that and set a different one to be the approved one.
In practice, it's somewhat complicated, and it gets more complicated as new features get added. But I'm trying to keep the complexity manageable, and FlaggedRevs serves as a warning beacon or something, I guess, to make sure it doesn't get too complex. And you can see it at that URL. I don't know how many people actually still use FlaggedRevs. This might be the closest one of my examples to a true replacement, in that FlaggedRevs seems to have just fallen off the radar. I know some people are still using it, but it was supposed to be heavily used on Wikipedia, and it's sort of dropped off.

Nuke is an interesting one. I think a lot of people still use it. Does anybody here use Nuke? OK, a few people. Yeah, OK. It lets you delete a large group of pages, and the way you do it is you just say this user or this IP address is clearly a vandal. So if they just created 200 pages, then you delete all of them by clicking on one button. I don't think it's that helpful. People may disagree. In my experience with spam, and maybe it's changed in the last five years, once spammers know about your wiki, they're going to hit it with a lot of different newly created user accounts. It's create an account, spam, create a new account, spam, et cetera, every five seconds or something. And they cycle through IP addresses also, maybe just to get around protections like Nuke. So I don't think it's that helpful. And you can see, actually, on the Nuke talk page, some people say, I accidentally entered the username of an actual user in Nuke. How do I get back the 500 pages I just deleted?

SmiteSpam is an extension I oversaw during a Google Summer of Code project, I think two or two and a half years ago. It doesn't go by IP address or username, although you can put those in a whitelist, saying, you know, don't look at these. But basically it just looks at the text of pages.
I mean, one thing that's common to a ton of MediaWiki spam is that it doesn't actually use wikitext, or very little, just enough to put in a link or something, but there are no section headers, internal links, or anything else. So, you know, it doesn't do neural networks or any fancy AI stuff, but even the simple algorithms it uses tend to be pretty good at identifying spam if you've just been hit by a lot of it. I don't know how much this gets used, but personally I think it's a much better option than Nuke or that kind of approach. So yeah, this is really just a plug for SmiteSpam, but if you're interested in it, you can get it at that URL.

Semantic MediaWiki. OK, this is probably going to be my most controversial slide, I'm sure, more even than the one where I needlessly slagged UML yesterday. So what does it do? You guys all know this, but it's used for storage, querying, browsing, and visualization of data in the wiki. It's a good extension, and I really don't want to say too much about this. I'm sort of including it here just for completeness. I personally think, and this is just my personal opinion, that it's a little overly complicated to install, to set up data structures, and to query, because you have to learn a whole custom query language. And if we have time at the end, maybe this is worth discussing, maybe not. I don't know.

The alternative is Cargo, an extension I was the original author of, and I'm still the main developer. It's one extension instead of up to 15 if you really use all of the Semantic MediaWiki family. The big advantage is that it uses standard database tables instead of providing its own custom storage system. So that just makes a lot of things easier in the process: less code, and less custom storage and querying happening. And there's the URL for it.

PonyDocs. This is really another kind of trivial one, because I'm almost sure no one here has used it or heard of it.
But it's interesting in concept: it's used to let you document software and hardware. So if you have a lot of pages about different products and different versions and so forth, then you can create a hierarchy of those. And of course, if you're already using SMW or Cargo and Page Forms, then you can have a structure just using those. But it does a few nice things, and I wish I had included a screenshot, like linking to the same documentation for different versions, that kind of thing. It's all stuff that can probably be done in a custom way, but it is convenient in some ways. It also includes some access control for different products, that kind of thing. It's poorly implemented. I'm sorry if any PonyDocs developers are watching this at any point. For the people who did it, that's clearly the one MediaWiki extension they ever created. The reason I know about it, and mention it, is that it's used at my company, Genesys, but we're trying to move away from it. So I've created a competitor, a docs extension, which doesn't have a name yet, but hopefully that will get released sometime this year, and start to get used to replace PonyDocs.

Collection. This one is interesting. I don't know if anyone here uses it. It lets you create PDF files out of multiple wiki pages, and it was developed by a company called PediaPress, I think, who make books like travel guides and so forth that are basically just a collection of Wikipedia articles on that subject, all put together. And for a long time, it was the only PDF solution if you wanted to make a book or whatever else. That was pretty much it. It's hard to set up. I don't know if anyone here has tried using it. Yeah, OK, would you agree that it's not ideal? It's hard to set up? OK. Yeah, and it has dependencies and so forth. There is an alternative, just released last month, and I know about it because it was developed by WikiWorks, for NATO, actually.
The impetus for it was that they needed formatting that Collection doesn't provide, which is not hard to do, because Collection doesn't really provide any formatting. So it's called DocBook Export, and not PDF Export, because it saves the content into DocBook format, which is a standard format used for books and documents. I'm more familiar with a format called LaTeX, but DocBook is simpler and I guess more popular for books and stuff like that. So what it does is it takes the HTML from a page, so it lets the MediaWiki parser do the wikitext-to-HTML conversion, then it uses this thing called Pandoc to turn it into DocBook, and then turns that into PDF. And what that means is that the process allows for a lot of customization. You can do indexing and include figures, and of course you have a whole lot of styling options, fonts, and everything else. And I think it's pretty easy to install. I mean, you need to set up Pandoc and the other utilities that are used for these different steps, but I got it working, and I still haven't been able to get Visual Editor installed, so for what it's worth.

Oh yeah, sorry, question. Do you use the word? Yeah, maybe I should have shown an example. Basically, there's just a parser function called, I think it's called docbook, where you specify, these are all the wiki pages I want, in this order, and then you specify some other stuff, like use this to create the index and that kind of thing. Oh, I've just been locked out of your laptop, Brian, sorry. So anyway, it's just a parser function, so you can just keep adding to it, and actually, at the moment anyway, every time somebody goes to get the PDF, it newly generates the PDF, so there's never a concern about caching or anything. There probably should be some caching, but anyway, yeah, it's easy to change. It's also easier to do that than with Collection, where there's a whole structure you have to create of books, define a book, and so forth.
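The first half of the pipeline described above (rendered HTML from several pages, in a chosen order, handed to Pandoc to produce DocBook) can be sketched roughly as follows. This is a minimal illustration in Python, not the extension's actual code: the function names and file paths are hypothetical, the DocBook-to-PDF step is left out because the talk does not name the tools used for it, and the only external piece relied on is the `pandoc` command line with its `--from`, `--to`, `--standalone` and `--output` options.

```python
import shutil
import subprocess
from html import escape


def combine_pages(pages):
    """Join (title, rendered_html) pairs, in the given order, into one HTML document.

    Hypothetical helper: each page becomes a top-level heading plus its body.
    """
    sections = [f"<h1>{escape(title)}</h1>\n{body}" for title, body in pages]
    return "<html><body>\n" + "\n".join(sections) + "\n</body></html>"


def pandoc_docbook_command(html_path, docbook_path):
    """Build the Pandoc command line for an HTML-to-DocBook conversion."""
    return [
        "pandoc",
        "--from", "html",
        "--to", "docbook",
        "--standalone",
        "--output", docbook_path,
        html_path,
    ]


def convert_to_docbook(html_path, docbook_path):
    """Run Pandoc if it is installed; return False when it is not on the PATH."""
    if shutil.which("pandoc") is None:
        return False
    subprocess.run(pandoc_docbook_command(html_path, docbook_path), check=True)
    return True
```

Because the input is the already-rendered HTML rather than wikitext, anything the wiki displays on the page (query results, tables) is what goes into the conversion.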
Yeah, that's done outside of the wiki. I'm new to all of this stuff, so I might not be totally right, but DocBook uses something called XSL style sheets, which are sort of like CSS, but for DocBook instead of for HTML. So it comes with some built-in XSL, but you can go onto the server and put in your own XSL style sheets instead, and then it'll just work, as far as I know.

Yeah, oh, with Collection? Oh, yeah, I forgot, that's another big advantage. I should have just made my talk about this. As far as I know, Collection uses the wikitext in order to create the PDFs, which means parser functions in general won't work, and that's especially true with queries, SMW queries. If you want to show a table or a map or something, I don't know exactly what it does, but it just won't show it, it just prints the wikitext. So that's another huge advantage of DocBook Export: because it takes the HTML of the wiki page instead of the wikitext, what you see is what you get in the output.

Oh yeah, Mark. How does this compare with what Wiki? Never heard of it, I don't know. Oh, okay, well, great. If anybody didn't hear that question, there's a new extension coming called ElectronStack, is that an extension or what? Elect, okay. It's not called ElectronStack. Okay, oh, Electron. It's under development, it uses Electron. Oh, I see, Electron is some third-party thing. Okay, I see, but can it take multiple wiki pages? That's the part they're working on. I see, okay, then, sorry. Oh, is this better than the Electron thing? I don't know, I guess we'll have to see it. I'm just looking at what you're saying about, you know, yeah, that's right, yeah, that's right. I'm guessing that they have something more like collection data, which was this page. So that, you know, the UI there is a little better.
Yeah, there are advantages and disadvantages to doing that. The advantage with a parser function is that anyone can change it, and you know, it's not admin-only, that kind of thing. And it's easier to modify the extension, but that's not relevant.

Okay, Visual Editor. Speaking of Visual Editor, I just mentioned this. It does WYSIWYG editing of wiki pages, and it's been discussed here before. It works great. You know, some people say it's slow, and it can be slow. The speed seems to be a function of, like, the square of the size of the page or something. It gets, I guess, quadratically slower as the page gets bigger, as far as I can tell. But the main problem I see with it is just that it's hard to install. And again, I've never been able to install it. And it requires Parsoid (I think that's misspelled on the slide). And, yeah, I mean, that's the thing. And it doesn't work with Page Forms, although that is going to change in the next few months, I'm pretty sure. But still, that installation is a real problem, in my opinion.

So, the alternative, TinyMCE. This is very new. It just came out a week ago, although it's been in the works for a long time, for over a year. It was mostly developed by another guy, I don't know if I should mention his name, although it's there on the wiki page: a guy named Duncan Crane in England, whom I've been in close contact with. And so I've helped out to some extent. It's not a replacement, in that it will never be as good as Visual Editor at editing wiki pages with all the edge cases and so forth, but it's just incredibly easy to install compared to Visual Editor. It's just JavaScript. You just install it and it works. And, oh, the slide doesn't actually say that, but it does work in Page Forms already. And there's the URL, and here's a little screenshot of the output, which is basically exactly what you'd expect from a WYSIWYG editor. That's it, yeah, good.
I don't actually know what would happen if you tried using them together. Each one of them creates a new tab. So in theory, you could have three editing tabs for a single page. Yeah, yeah, yeah, sorry. So if it was strict, then... Yes, you can do that. Can you do that? Because that seems like it would be useful that way, you know? Well, the way it works is that pages that are form-editable will just have the "edit with form" tab. And there, within the form, you can specify for each text area, I want TinyMCE here, et cetera. Pages that are not form-editable will get the separate TinyMCE editing tab. And what I'm saying is, if a page is form-editable, you won't see that tab, so it's always off. Oh, I see, pages that aren't form-editable. I'm not sure, actually. I see what you're saying. I'm not sure, but it should be easy to add that setting if it's not there. Right, right, yeah, yeah, that is an option. Although if you already have Visual Editor installed, then if and when Visual Editor support comes in for Page Forms, you might as well use it for everything. Sure, but I'm looking for a quick page. Okay, well, all right.

All right, here's an interesting one. I think this is my last one. It was an extension called Flow until, what, two months ago or something, and then they decided to change the name to Structured Discussions. It offers a structured layout of talk pages. So instead of a regular talk page, and this is the talk page for the main homepage on mediawiki.org, there's tabbing and all that stuff. You can see there, I don't know if you can see the search interface, but yeah, there's just a bunch of features there. And notice that it's a relatively small column inside the overall page, and there's a big sidebar that's reserved for just the description and stuff like that. Oh, okay. I think personally it's as clunky as its new name. I don't know if there's anyone here who was involved in the name change, but that's just horrible.
I mean, I can't imagine anyone saying "Structured Discussions" over and over when talking about this extension. I like Flow. I think Flow is great. Oh, I see: Flow, not "Flow" in quotes, not the name, but the thing. I don't know. I like the name. It grew on me. Yeah, no, no, no, yeah. Echo and Flow sort of came together. I don't know. I like the whole sound of it. Maybe it sounds better to me now after seeing the alternative.

Yeah, I mean, I'm certainly not the only person who's said this: it's very hard to follow long discussions on it, in large part because it has big fonts, lots of white space, and this thing, infinite scrolling, which I was never a fan of, even though it's become really popular on the web. And there was a plan to implement it on Wikipedia, and there was a big user revolt, which is not something new with new extensions. But anyway, if other people have different opinions on Flow / Structured Discussions, I'd be glad to hear them. This is just my personal opinion again. Yeah. I like the integration of that here. I like the way the support desk works. Yeah, yeah, yeah, yeah. Extensions that integrate with Echo are great. I don't really... Yeah, sure. Yeah, notifications are great. It would be good if more extensions integrated with Echo. It's just that infinite scrolling, I think, makes it hard to find anything that was from a year ago or something. Right, right, right.

So this is the one case that I wasn't involved in developing in any way, but there's an extension developed at MITRE called Comment Streams, which I think is better. It's more configurable, and it's less intrusive. It works more like the Facebook-style comments that you sometimes see at the bottom of pages. And you can, in fact, include it at the bottom of the page itself, not just in the talk page. And there's the URL for it. Sorry? And it's integrated with Echo. Okay. But...
And soon Visual Editor, hopefully. Sorry? Soon Visual Editor. Oh, and soon Comment Streams will be integrated with Visual Editor. Yeah, and stick around for Friday afternoon if you want to see something more about that.

So yeah, I mean, these are just disparate examples. I've tried to find a thesis behind this whole thing. For anyone who is thinking about creating an extension, or hiring someone or asking their employees to create an extension, the question is: when is creating an alternative justified? When should you bother if there already is an extension that nominally does what you want it to do? Often, it's when the current solution is too complicated. For three of these seven cases, the alternative extension does less, not more. Often, I think that's the way to go: simplifying things versus adding more features. Four of these are maintained by the Wikimedia Foundation, for what it's worth. That may just be a function of the fact that their extensions are always the standard to go by.

Let's see, are there any others? This is sort of an open question. Are there other candidates for replacement? And actually, thinking about this made me think about Flow again, so this is really about another possible alternative to Flow / Structured Discussions / Comment Streams. And this is really just an open question, just something I've been thinking about. A lot of the point of these is that it's just hard for users to learn the talk-page syntax. You have to figure out how to indent your comment to exactly the right number of spots, and then you have to put in the four tildes to make your signature. So this is just a random mock-up I put together. This is from the Wikipedia talk page on Houston. And the idea is you just have a little bit of JavaScript. You can't even read that, but it says "add reply here." You just have a little bit of JavaScript that puts in arrows.
And so you can click on any of those and it would pop up a window that just lets you put in your wikitext, and it'll take care of the signature and the indenting and everything. Just signature and indenting, I guess. And it might validate your wikitext. So yeah, it's just something to think about. I don't know if anyone has any thoughts on that. Of course, there may always be a need for a true commenting system, but this could potentially be a lightweight approach to that same thing, especially if you already have a lot of content in talk pages and you just want to make things easier for your users. And is that it? Yeah, I think so. That's it.

Any questions, comments? I guess PDF export was the big issue, if anything. Yeah. Yeah, man. Oh yeah, microphones.

Yaron, thanks for that. In terms of Semantic MediaWiki and the forms and Cargo. Yeah. I mean, is there a way of combining the both of them, rather than having these two separate extensions? They're not competing, but it just seems a shame that we've got these two extensions that seem to be doing some of the same job as each other.

Yeah. The question is just, Semantic MediaWiki and Cargo, can they just be friends or something? What I would say is, there are cases of people using the two of them, not together, but on the same wiki, and you certainly can do that. They can't really work together. They each have their own storage system and they can't read each other's data. So that's pretty much it. But certainly with Cargo, I mean, the look and feel of it was pretty much stolen from Semantic MediaWiki. I mean, I just copied all of that. So I don't know where that's going really, but I'm not trying to reinvent the wheel. I guess I did reinvent the wheel, I don't know. It's another thing. Okay.
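The lightweight reply idea from the mock-up (the tool, not the user, handles the indenting and the signature) comes down to a small amount of text manipulation. The sketch below is in Python rather than the JavaScript mentioned in the talk, purely for illustration, and the function names are hypothetical. It assumes the usual talk-page convention that a reply is indented with one more leading colon than the comment it answers and is signed with four tildes.

```python
def indent_depth(line):
    """Count the leading colons that mark a talk-page comment's indent level."""
    return len(line) - len(line.lstrip(":"))


def format_reply(parent_line, reply_text):
    """Indent a reply one level deeper than its parent and append the signature."""
    depth = indent_depth(parent_line) + 1
    # Collapse line breaks so the whole reply stays on one indented line.
    body = " ".join(reply_text.split())
    return ":" * depth + body + " ~~~~"


def insert_reply(lines, parent_index, reply_text):
    """Return a copy of the page lines with the reply placed after the parent's thread.

    The reply goes below any existing, more deeply indented replies to the parent.
    """
    parent_depth = indent_depth(lines[parent_index])
    pos = parent_index + 1
    while pos < len(lines) and indent_depth(lines[pos]) > parent_depth:
        pos += 1
    new_line = format_reply(lines[parent_index], reply_text)
    return lines[:pos] + [new_line] + lines[pos:]
```

For example, replying to a line that starts with two colons produces a line that starts with three, so the user never has to count them.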
I guess all the main questions were about PDF export. Yeah.

I agree with you about Flow, in terms of a lightweight JavaScript thing with reply buttons, to put syntactic sugar over the indentation thing, basically. Sounds like a great idea to me. I've kind of thought of similar things at times too. It just seems like a very nice in-between of old talk pages versus full-on Flow or LiquidThreads or whatever. I don't know, people should make it. Cool. Yeah, that's really good to hear. That's interesting.

Yeah. So I was wondering about the DocBook export, because I wrote the HTML to wiki export using Pandoc as well. I don't know if you know anything about the HTML to wiki extension, but when I saw Pandoc and all the formats that it converts, I was thinking about making an extension that's a universal format converter and importer. Yeah, Pandoc is incredible, and it seems almost wasteful to just use one little part of it. It even actually does wikitext. It does anything to anything: any of these formats you can convert to one another using Pandoc, as far as I know. Yeah, yeah, yeah. The feasibility is there, right? Right, yeah, it would be great to see. Swiss Army knife of... Sure, yeah, yeah, absolutely. By the way, I didn't really explain it, but it's an open source utility that you can just download.

You said it converts from any... Does it convert from PDF to wikitext? I don't know, but I wouldn't be surprised. No, I don't think so. Okay, well, why? Because if it can go from PDF to something that's just text-based, it can do wikitext. I don't know, maybe someone should look it up. I can't imagine it would work that well, but that's a whole separate question. Probably, I don't know, tables would get lost and stuff like that. Oh, yeah, sure, yeah, yeah. PDF is... Yeah, but back to that question while Cindy gets the mic.
Yeah, I think it would be great to have more extensions, or maybe just a Pandoc extension, that really opens up all this functionality.

We did have an issue once at MITRE where we needed to generate a wiki from a Word document, and we had somebody who wrote a bunch of scripts using Pandoc that converted from a Word document to wikitext, so it's possible, theoretically at least. Yeah, that's good. Pandoc gives an example on their page of... From Pandoc? Yeah. Oh, Pandoc. I don't know about from PDF, sorry, from PDF to LaTeX. Yeah, okay. So maybe. We're half of the way there or 80% of the way there, so. Right, yeah, well, then Word to wikitext, so that's probably something. It was Pandoc. It was scripts to invoke Pandoc. What I'm saying is it wasn't an extension, but if it's possible from scripts, there's no reason it couldn't be packaged as an extension. And the cool thing was that it actually took all of the images and uploaded them as files, too. Okay, thank you.
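The Word-to-wikitext scripts mentioned at the end were not shown, so the following is only a guess at what the core of such a script might look like, written in Python with hypothetical function names and paths. It relies on Pandoc's `docx` input format, its `mediawiki` output format, and its `--extract-media` option, which writes the document's embedded images to a directory. Uploading those images to the wiki, as the MITRE scripts reportedly did, is deliberately left out.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_docx_to_wikitext_command(docx_path, wikitext_path, media_dir):
    """Build a Pandoc command that converts a Word document to MediaWiki markup."""
    return [
        "pandoc",
        "--from", "docx",
        "--to", "mediawiki",
        f"--extract-media={media_dir}",  # embedded images are written here
        "--output", wikitext_path,
        docx_path,
    ]


def list_extracted_media(media_dir):
    """Return the extracted image files, sorted, or an empty list if none exist."""
    root = Path(media_dir)
    if not root.is_dir():
        return []
    return sorted(str(p) for p in root.rglob("*") if p.is_file())


def convert_docx(docx_path, wikitext_path, media_dir):
    """Run the conversion if Pandoc is installed; return the extracted media files."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on the PATH")
    command = pandoc_docx_to_wikitext_command(docx_path, wikitext_path, media_dir)
    subprocess.run(command, check=True)
    return list_extracted_media(media_dir)
```

The list returned by `convert_docx` is what a follow-up step would iterate over to upload each image as a wiki file.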