All right. Well, let's go. Hi, everyone. My name is Brian. I work on the security team. I'm going to talk to you all about security when writing gadgets and MediaWiki extensions. We're going to do this in a slightly old-fashioned way, because I don't have a connector to my laptop for my presentation. What do you need? A Thunderbolt to HDMI. A Thunderbolt to HDMI and VGA. All right. Cool. Is there an Etherpad session? I haven't set one up. I don't know. I don't think so. I'll see if there are things. If not, we get an experiment in how good a presenter I am. Yeah, I don't. That's not... Is it actual? Yeah, no, I need an actual HDMI. No, this will be fun and exciting. How many presentations have you been to so far that used a whiteboard? All righty. So, yeah, in terms of security, the first thing to talk about is: why do we care? What are we trying to do? Intuitively, it kind of makes sense. We want to make sure our code is not used improperly. When we're talking about gadgets, usually what we want is to make sure the gadget can't be used in a way other than how it's meant to be used, can't be used to trick the user into doing something they shouldn't do, and can't be used to take over another person's account. With extensions, we also want to make sure they can't be used to take over the server or stuff like that. So, in the gadget case, the most important thing, well, not most important, the prime thing to worry about is cross-site scripting, as it's usually in JavaScript code that the problem is. Usually this is when user input is put directly into HTML without escaping it or anything. So, for example, if you have something like... Is this actually readable? No. I don't have a darker color. Okay, that's better. So when you do document dot, for example, you get an element, right? So this is a common idiom for adding something to the page, right?
You do document.getElementById for some ID and then .innerHTML. I don't know if that's capitalized. Plus equals and then something like, let's say... Let's say that. So, for example, this would get the element with some ID, then you'd add "Hi" and the page name after it. This would be a security problem because a person could go to a page whose name has some sort of malicious characters in it, and then it might get interpreted weirdly. In the page name case, there's not that much evil you could do, because page names aren't allowed to contain less-than or greater-than signs. But you could, for example, if it had an ampersand in it, like if it was... How do people do this one? I was like, oh, this is actually kind of hard. If you had something like that in it, that would get turned into an arrow, which is unlike... The bigger thing is when you take a variable that isn't a page name, so it can have less-than signs in it. So, for example, if you're looking at, say, the URL arguments, document.location, you could insert that, and the user might have something like... For example, say it had something like that, where it's taking something from the URL. The user might be pointed to a URL that's like mypage.com/<script> and then do evil stuff, right? So, yeah, we want to prevent that. And the key way to prevent that is to make sure that we escape all user input. There are a number of ways to do that. In MediaWiki, usually we use mw.html.escape. That will replace all the less-than signs with the &lt; entity reference (ampersand, LT, semicolon). And that makes sure it's safe to put in the document, so no HTML can be executed. This is really important because otherwise, for example, a different person could trick the victim into going to a web page with that in the URL and essentially totally take over the person's account, cause them to edit other things, or trick them into revealing their password or whatnot.
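To make the escaping step concrete, here is a minimal standalone sketch. The `escapeHtml` helper is a hypothetical stand-in written for illustration, not MediaWiki's own code, and `fromUrl` is an invented attacker-controlled value. It only shows the general idea of replacing HTML-special characters with entity references before the text goes into markup.

```javascript
// Hypothetical stand-in for an HTML-escaping helper such as mw.html.escape.
// It replaces characters that are special in HTML with entity references.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;') // must run first so later entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

// Hypothetical attacker-controlled value, for example taken from the URL.
const fromUrl = '<script>doEvilStuff()</script>';

// Unsafe: the raw value becomes part of the markup.
const unsafeHtml = 'Hi ' + fromUrl;

// Safer: escape first, so a browser would display the characters
// instead of interpreting them as a script element.
const safeHtml = 'Hi ' + escapeHtml(fromUrl);

console.log(unsafeHtml);
console.log(safeHtml);
```

In a real gadget you would call the library's escaping function rather than writing your own; the point of the sketch is only what the escaped string looks like.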
Escaping is one method. An even better method, which doesn't always apply but sometimes does, is to use APIs that don't take HTML in the first place. So, for example, with jQuery, often you can construct trees of HTML elements and then use the .text method, which doesn't take HTML at all. So, for example, something like this is very safe, because it's creating a div and then putting this text in the middle of it. And if you supply HTML to this text function, it doesn't execute it. So, where possible, it's much better to use this method because it's much harder to screw up; it's basically impossible to mix up. With the other method, it's very easy to accidentally forget to escape something, and then you're kind of screwed. So, yeah. Does that make sense? Any questions? Let's see. So, moving onwards, that's largely what you have to worry about with gadgets. With MediaWiki extensions, there's a lot more you have to worry about. You still have to worry about this, but you also have to worry about SQL injection and other types of things. The key point is, more or less, to make sure you always use the appropriate escaping function, and then it's safe. So, on the PHP side, you usually use the htmlspecialchars function to escape your HTML. Let's see. So, for example, something you often see, if you have, say, a parser tag, is something like that in your parser tag, which would be unsafe because this wfMessage function, when you call text on it, returns the text unescaped. Instead, you would normally use escaped, to make sure that it's escaping HTML, or parse. Let's see. So, in terms of that, another common issue when you're talking about PHP extensions is cross-site request forgery. Say you had an extension that has some sort of form on it, like a special page or whatnot. So, normally you show the form and then you process the input, right?
Generally, you need to have a token there, because other sites can also submit to your form. In web browsers, we have something called the same-origin policy, which ensures that websites can't read the content of other websites on other domains. That protects a lot of things, but it's sort of like a file permission where you have write and execute but not read. Sites can execute scripts from other domains, and they can write to them: they can send POST or GET requests to other domains. So, if you have a special page in your extension that takes some input, a filled-out form or whatever, you need to ensure that other websites can't just post to that form and pretend to be the user who's viewing it. So we add a little token that's an opaque value. Because other websites can write to other domains but can't read from them, they won't know what the token is. So, for example, say we have a website with a page like this, called something like "make admin", to make someone an administrator. If all we do is take the username, then my other evil website could have a link to this page with this username and then either redirect users to it or trick the user into clicking on it. Then, since that user is already logged in, the request will go through and it will act as if they did it, even though they didn't realize it, because they're on a totally separate domain. So, instead, the form, in the correct usage, has a hidden input tag named wpEditToken. All the forms on Wikimedia websites have this edit token field. I'm sure if you've used the API you've often had to fetch an edit token, and when you're new to using the API it can be frustrating to figure out how to fetch the token. So, yeah, it's a token like this.
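The token idea can be sketched in a few lines. This is an illustration under stated assumptions, not how MediaWiki implements edit tokens: the in-memory `tokensBySession` map, the function names, and the `handleMakeAdmin` handler are all hypothetical, and it assumes a Node.js environment for the `crypto` module. It shows a server handing out an opaque random value and refusing a form submission that does not carry it.

```javascript
const crypto = require('crypto');

// Hypothetical in-memory store mapping a session id to its secret token.
const tokensBySession = new Map();

// Issue an opaque random token for a session; it would be placed in a
// hidden form field that only the real site can read.
function issueToken(sessionId) {
  const token = crypto.randomBytes(16).toString('hex');
  tokensBySession.set(sessionId, token);
  return token;
}

// Check that the token submitted with the form matches the one issued
// to this session.
function tokenMatches(sessionId, submitted) {
  const expected = tokensBySession.get(sessionId);
  if (typeof expected !== 'string' || typeof submitted !== 'string') {
    return false;
  }
  const a = Buffer.from(expected);
  const b = Buffer.from(submitted);
  // timingSafeEqual requires equal lengths, so check that first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}

// Hypothetical handler for a "make someone an administrator" form.
function handleMakeAdmin(sessionId, form) {
  if (!tokenMatches(sessionId, form.token)) {
    return 'error: bad token';
  }
  return 'ok: ' + form.username + ' is now an administrator';
}

const token = issueToken('alice-session');

// Genuine submission: the form carried the hidden token.
console.log(handleMakeAdmin('alice-session', { username: 'Bob', token: token }));

// Forged submission from another site: it cannot know the token.
console.log(handleMakeAdmin('alice-session', { username: 'Mallory' }));
```

The forged request is rejected because the other site can send the request but has no way to read the token it would need to include.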
The value is randomly generated, and the main property is that it changes all the time and each user has a different one, so that nobody but you can know your token. So, this gets submitted with the form, and then when the request goes to the special page's make action, it includes a token. Now that it includes the token, the special page has to check that the token belongs to the currently logged-in user, to ensure that it's not some other website trying to trick a user into going here when they didn't actually mean to. So, it uses the user. In a special page you'd use $this->getUser(). And then on that you check to see if the token matches, and if not you give some sort of error; otherwise you continue with the request. This is actually a pretty common thing to overlook, and it can be a serious vulnerability, because most administrators can be tricked into clicking on something. You put a message on their talk page saying, hey, this vandal is saying mean things about me on this other website, can you do something about it? An administrator, asked to investigate, goes to the other website. And if there's a CSRF vulnerability, that other website could do things on behalf of that administrator's account without the administrator really realizing. But when we check the token, that prevents it. Normally, in your extension, it's often better to use APIs that take care of this for you. In MediaWiki that would be the HTMLForm class, which allows you to specify a form, and as long as you specify that the form uses the POST method, it will automatically do the edit token for you. Yeah, so at this point, well, I was wanting to have my laptop so I could demo it, which would be much, much more interesting. But I did want to ask people if they had any particular questions about how this stuff works or anything.
Anything they wanted to know in particular, or if they currently have an extension that they're working on and they're unsure of something. Okay then. Well, in any case, feel free to interrupt me at any time if you do. So one of the other main sources of vulnerabilities is what's called SQL injection, which is very similar to cross-site scripting except it's with SQL. So, for example, a very common way to write SQL is to do something like this. This is called $dbr, which is the MediaWiki convention for our database objects, and the query method just takes SQL. For example, if you wanted to find, say, the edit count of a specific user, you might do something like this: select the edit count from user where the user name is equal to whatever name is supplied, right? Sorry, I have to wait. So say you're working on an extension, it's like a parser function, and you want to be able to get the edit count of a specific user. The main part of the extension is probably querying SQL, and you might do the query: select user_editcount from user where user_name equals $name, and maybe you get $name from the user, right? From whoever specified it. This would be an SQL injection because you're putting the name directly into the SQL. So, for example, say it's a parser function, something like #editcount, and that's how it works: the user on your wiki specifies something like that. Instead of giving a real user name, they might give some value that's my user name and then an apostrophe. An apostrophe closes the string, because this gets substituted into here, so if you add an apostrophe, that closes the string and then you can add other things to the query.
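The effect of the apostrophe is easiest to see by printing the query text that naive string building produces. The sketch below only builds strings and never talks to a database; `buildQueryUnsafe` and the sample names are hypothetical, and the malicious value is an invented illustration rather than the one from the whiteboard.

```javascript
// Naive query building: the name is pasted straight into the SQL text.
function buildQueryUnsafe(name) {
  return "SELECT user_editcount FROM user WHERE user_name = '" + name + "'";
}

// An ordinary user name produces the query the author intended.
console.log(buildQueryUnsafe('Alice'));

// Hypothetical malicious "name": the apostrophe closes the string literal,
// and everything after it is read by the database as SQL, not as data.
const evil = "Alice' OR '1'='1";
console.log(buildQueryUnsafe(evil));
```

In the second query the WHERE clause no longer compares against a single name, which is exactly the kind of change to the query's meaning that an attacker is after.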
So, for example, you might add INTO OUTFILE, something like... okay. So if you add something like this, you're now asking for the edit count of a user named Beowulf, apostrophe, INTO OUTFILE /var/www/index.php. INTO OUTFILE is a feature of MySQL where you can take your query result and put it into a file. Your SQL user would need file rights, which it should not have, but some people set up MySQL wrong. So in this case, to truly exploit it: you've now created a file, you've overwritten index.php, so you've now kind of totally destroyed the wiki. That's not the worst you could do, though. You could also add something to INTO OUTFILE using a delimiter, and then make your delimiter something like contents, something like that, to load... sorry, that's not really very clear, but basically you could use as your delimiter a small PHP script that downloads a file from the internet and executes it. From there someone could totally take over your server. Or there are other things they could do. Depending on how you've configured things, they might not be able to do that, but they might be able to get data out of the password table and find user passwords or whatever. So, to prevent this, what we usually do is, again, escape the input. Instead we use something like the addQuotes method of IDatabase, which quotes the variable and escapes all the apostrophes or whatever, and then it's safe. Even better than doing that, though, is to use MediaWiki's select wrapper to make your query look something like this. You divide it up into all the components and then MediaWiki puts it together for you. So first, from this table... yeah. So when you use the select method, MediaWiki takes care of a lot of the escaping for you, and it's easier to see at a glance that it's escaped properly. So this is the recommended way. It has a weird order, where the table comes first, then the fields, and then the where clause, where if you supply just a string it goes in unescaped, and if it's an associative array, any of the non-numeric keys
become equalities, and it automatically escapes the value. When you do that, you and other reviewers can easily see that it's escaped properly, and it's also kind of more convenient to use. Yeah. Any questions about that? Let's see, what else can we talk about? That's kind of what I wanted to talk about. I don't know if anyone has any other questions to ask or anything like that. Well, here's one from me, maybe: JavaScript global variables, are they to be trusted? I guess if the global context is compromised then I'm screwed anyway, right? Yeah, pretty much. Generally speaking, you can trust the global variables in JavaScript. If somebody is able to mess with them, they can almost certainly execute code, and all is lost. That said, you should trust the ones where you know what they are. If you don't recognize the variable, then, you know. But it's not like old versions of PHP with register_globals. Anything else? Sorry, I don't think this was the most enthralling presentation, with the whiteboard demo. Let's see, what other things? On a similar note to this, if you're executing other files from PHP, like if you're shelling out, similarly there's escapeshellarg to use to make sure your shell arguments are escaped and someone can't add something to the shell argument to execute arbitrary code. The general rule is: if you're executing another language, you always have to escape what you're putting in there that comes from the user. Is there any resource to read up on this? Yes, we do have a page on mediawiki.org called Security for developers, I think, or Manual: Security for developers. And also, if you have concerns, we're on IRC, and you can ask on the MediaWiki IRC channel and people should be able to help you. We're around and we're happy to help. Okay, so I think I'm going to end the presentation there. Thanks, Martin. And of course if anyone has any questions later, or anytime, even after the conference, please feel free to reach out by email or find me
in person or on IRC, or our other two members of the security team, Sam and John, over there. We're here to help make sure MediaWiki's code base is secure, so if anyone has any questions about security, don't hesitate to ask. Yeah. Yeah, if you're not sure about something, you can tag us in and we can look, as long as it's MediaWiki-related in some fashion. And if you ever see something that you think is a vulnerability, you can email us at security@wikimedia.org. We always appreciate people who find vulnerabilities in MediaWiki code or in extensions.