So this is going to be pretty quick. Who I am, where you can find me, the Performance Team, et cetera; we're fairly easy to track down.

So why do we worry about performance? A performant site equals happy users, or at least helps lead to happy users in general. That seems fairly straightforward and fairly obvious. There's another, somewhat more subtle reason why we worry about performance: there is a lot of evidence that the performance of a site directly impacts how much trust people place in the authority of that site. There's a reason that, for example, page performance is a factor in Google's search ranking. It's one of those really subtle psychological factors that can have a significant impact on how people view the information they're receiving, or whether they trust it.

How do we actually measure page performance? We have two broad mechanisms by which we measure the performance of MediaWiki itself and the Wikimedia sites.

Real user monitoring (RUM) covers everything we do to collect data from actual users of the site. In our case, we have a MediaWiki extension called NavigationTiming. We currently sample one out of every 1,000 page loads on the WMF sites, which works out to roughly 100 data points per second. We send all that information back using the EventLogging framework, aggregate it, and track those data over time. If you pull up DevTools in your browser (I've got this in the footnote here), you can actually see the data we collect: it's part of the window.performance object within the JavaScript console. There's a fair amount of it, and it's supported by basically every modern browser.

The great part about real user monitoring is that it shows us what actual users are seeing, the actual lived experience of a WMF site. The downside is that there are a huge number of confounding factors: the client device, the client geography, the client connectivity. If two ISPs get into a peering conflict with each other, or if global DNS services are slow, all sorts of things entirely outside of our control can affect the numbers.

The simplest example: you'll see the responseStart graph I have on this page, and you'll notice it has a fairly pronounced sine-wave pattern, with a median responseStart as low as about 250 milliseconds and as high as about 375 milliseconds, so 50% higher. The peaks are at roughly 8 AM UTC, which is to say at a time when a significantly higher portion of our traffic is coming from Europe, Asia, and Africa. Meanwhile the valleys, the times when we have the fastest responseStart, are times when our peak traffic is coming primarily from the United States. So just from that you can see we've got a 50% day-to-day swing in perceived performance, at least with respect to this one metric.
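To make the RUM side concrete, here is a minimal sketch of the kind of collection just described, using the standard Navigation Timing and Paint Timing APIs that window.performance exposes. The one-in-1,000 sampling rate comes from the talk; the beacon endpoint and payload shape are illustrative assumptions, not the actual NavigationTiming extension code.

```typescript
// Minimal sketch of a RUM beacon using window.performance.
// Endpoint and payload are hypothetical, for illustration only.

const SAMPLE_RATE = 1 / 1000; // roughly the rate described in the talk

function collectNavigationMetrics(): Record<string, number> | null {
  const [nav] = performance.getEntriesByType(
    "navigation"
  ) as PerformanceNavigationTiming[];
  if (!nav) return null;
  const firstPaint = performance
    .getEntriesByType("paint")
    .find((p) => p.name === "first-paint");
  return {
    responseStart: nav.responseStart,   // backend/infrastructure latency
    domInteractive: nav.domInteractive, // initial DOM parsed
    loadEventEnd: nav.loadEventEnd,     // onload finished
    firstPaint: firstPaint ? firstPaint.startTime : -1,
  };
}

// Sample roughly one in every 1,000 page loads and send the result home.
if (Math.random() < SAMPLE_RATE) {
  window.addEventListener("load", () => {
    // Wait one tick so loadEventEnd is populated after the load event.
    setTimeout(() => {
      const metrics = collectNavigationMetrics();
      if (metrics) {
        // "/beacon/navtiming" is a made-up endpoint for this sketch.
        navigator.sendBeacon("/beacon/navtiming", JSON.stringify(metrics));
      }
    }, 0);
  });
}
```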
The other type of monitoring we use is called synthetic monitoring. This is where we have a tightly controlled environment, take the same measurement over and over again, and track the changes in that measurement over time. The primary tool we use for this is an application called WebPageTest. It allows you to record repeated page loads. For example, the Barack Obama article is one we use as a sample: we load it ten times in a row, take the average of those page loads, and use that as the sample for that minute or ten-minute period. This lets us identify regressions or improvements in our code, because the variability of the external factors is much lower. We factor out things like network variation and resource competition on the client device, and because we control the environment the measurement is taken in, we can do a lot to eliminate even more of those factors. We end up with a variance of about 1% to 2% in page load time, which is an incredibly tight level of sample error.

The con of synthetic monitoring is that it doesn't give us a "real" measure of performance. It gives us something we can compare over time, but not something we can compare externally: if synthetic page load time gets 5% longer, that doesn't necessarily mean page loads are taking 5% longer in the real world. The point being, RUM and synthetic monitoring are measuring slightly different things.

Yep, go ahead.

Sorry, can I interrupt for a sec? Just on the real user monitoring, do we have the global set of measurements?

What do you mean by global set? Like other geos?

Yes.

Cool. The samples we collect are not filtered based on geography or anything else, so it really is one out of every thousand requests, at random, no matter where in the world they come from. Something we're going to be rolling out in roughly two weeks, right after the current release freeze ends, is a set of enhancements to the NavigationTiming extension that will let us oversample based on geography. We've been doing this in preparation for Singapore going live at the end of January: for someone located in Malaysia, Singapore, China, et cetera (the geographies most likely to be directed to that data center), we want to be sampling at a far higher rate than one out of every thousand page loads, because we want a much bigger data set when that data center actually goes live. Does that answer the question?

Yeah, sort of, but do we have the ability to look at, say, page requests from India, or from Nigeria?

Yes and no. We don't have that in the aggregated data, but we do have it in the event log stream, so we would need to query it out of the EventLogging system. Each data point we get is tagged with the geography it comes from.

Okay, that's useful. We may want to have Tillman take a look at that, or someone else if Tillman's busy, because one of the things we're very interested in is seeing whether this new data center, and the hopeful speed increase it will provide to people in that part of the world, will have an effect on consumption, right? We're really interested to understand that.

Yeah, absolutely, and that's exactly why we've made these enhancements to the NavigationTiming extension. Where right now we might get five data points per minute from a particular relevant geography, depending on the time of day, we would love to get 500 data points a minute over a couple-of-week period so that we can get a really detailed sense of that. But yeah, we have that data, or at least we can compile it in useful ways, I think.
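A hypothetical sketch of what per-geography oversampling could look like, under the one-in-1,000 base rate described above. The country codes, factors, and function name are made up for illustration; the real NavigationTiming enhancements will have their own configuration.

```typescript
// Hypothetical per-geography oversampling (illustrative only).
// Keeps the global 1-in-1,000 base rate, but samples regions likely
// to be served by the new Singapore data center at a higher rate.

const BASE_RATE = 1 / 1000;

// Oversampling factors by ISO 3166-1 country code (made-up values).
const OVERSAMPLE_FACTOR: Record<string, number> = {
  SG: 100, // Singapore
  MY: 100, // Malaysia
  CN: 100, // China
};

// Decide whether to instrument this page view.
function shouldSample(countryCode: string): boolean {
  const factor = OVERSAMPLE_FACTOR[countryCode] ?? 1;
  return Math.random() < BASE_RATE * factor;
}

// Example: a request from Singapore is sampled at 1 in 10 instead of
// 1 in 1,000, giving a much larger data set for that geography.
console.log(shouldSample("SG"));
```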
In terms of the actual metrics we measure for real user monitoring, there are five basic ones we look at. responseStart measures basically the performance of our infrastructure: the delay between the start of the request and when data starts coming back from us. domInteractive is the time it takes to parse the HTML page and create the initial document object model. firstPaint is when that initial document object model is actually displayed on the screen. loadEventEnd is basically when the onload event finishes firing. And save timing is measured on writes, on page edits: it records the amount of time it takes to save an edit.

For synthetic monitoring, we look at start render, which is basically when page content first begins to be rendered on the screen. Fully loaded is the time it takes for the page to completely render; this should generally be similar to loadEventEnd. We also look at last visual change, which may sound like the same thing, but it takes into account the fact that you can modify the page after the page load is complete. By attaching a JavaScript handler to the onload event, you can delay the loading of certain elements or images, or load a very low-resolution image initially and replace it with a higher-resolution one after load completes. All of those change the visuals on the page, and last visual change won't be recorded until all of that is done. The final thing we look at is Speed Index, which is a measure of perceived performance based on the percentage of the viewport that has been rendered at each point in the page load process. Rather than give an exhaustive description of Speed Index here, I really recommend checking out the description on the WebPageTest site; the way they calculate it is pretty tricky and pretty neat.
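As a rough illustration of the idea behind Speed Index: WebPageTest defines it as the integral over the page load of (1 minus visual completeness), so a page that shows most of its content early scores better than one that stays blank and then pops in all at once. A minimal sketch, assuming we already have (time, completeness) samples rather than deriving them from video frames as WebPageTest does:

```typescript
// Speed Index sketch: sum of (1 - visual completeness) over time.
// Lower is better. Input samples are assumed, not measured here.

interface VisualSample {
  timeMs: number;       // milliseconds since navigation start
  completeness: number; // 0.0 .. 1.0 fraction of the final visual state
}

function speedIndex(samples: VisualSample[]): number {
  let index = 0;
  for (let i = 1; i < samples.length; i++) {
    const interval = samples[i].timeMs - samples[i - 1].timeMs;
    // Area of the "incompleteness" strip for this interval.
    index += interval * (1 - samples[i - 1].completeness);
  }
  return index;
}

// Example: reaching 80% completeness at 500 ms keeps the score low
// even though the load doesn't fully finish until 2000 ms.
const progressive: VisualSample[] = [
  { timeMs: 0, completeness: 0 },
  { timeMs: 500, completeness: 0.8 },
  { timeMs: 2000, completeness: 1 },
];
console.log(speedIndex(progressive)); // 500*1.0 + 1500*0.2 = 800
```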
A few things we have coming up over the next one to two quarters. We've got some materials that we're just finishing up that are going to give a really detailed introduction to performance testing, covering a lot of what I'm covering here but in much, much deeper detail. A self-paced training module on performance analysis: that will probably end up shipping as a Docker container or a Vagrant box that walks you through all the different tools available to you, the common performance issues you're likely to see, and what those actually look like within the browser or the dev tools. Performance testing as a service: I mentioned WebPageTest; we have that up and running right now, managed by us, testing a relatively limited number of pages. We are not trying to do a generalized performance test right now. What we would like to do is make that available to individual teams, so that you can tie it into your CI workflow and really get a sense of performance changes over time as you modify your own code.

Also, a literature review of existing research on performance. That may sound like a funny thing, especially to be listed so far down, but one of the interesting things about the performance world is that there are a lot of articles of faith backed only by disparate, scattered research. Not much has been done to pull that research together into anything particularly coherent and easy to distribute, so we're going to put some time into that.

And finally, we're looking at collecting resource timing on slow page loads; there's a rough sketch of the idea at the end of this section. The point is to collect information on which individual elements of a page caused it to load slowly. Support for collecting that information is starting to roll out in both Chrome and Firefox, and we're hoping to have it out there by the middle of this coming year.

A couple of gaps, things we know are issues for us right now. Mobile apps: we really don't have good performance metrics for the mobile apps, and we don't have generalized tool sets the way we do for the browser side. That's obviously relevant to readers, and we know we need to invest some time in it. At this point we don't have a specific plan or timeframe, and we would certainly love to work with you to figure out the right approach.

We also have a gap with older browsers, which don't support the Navigation Timing API. Chrome has had it for a very long time; a lot of browsers haven't, and in particular a lot of mobile browsers haven't. This means our real-user samples are by necessity skewed a little toward newer browsers and newer devices, which also means we are skewed a little more toward the global north.
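Here is the promised sketch of the resource-timing idea: when a page load crosses some slowness threshold, grab Resource Timing entries so you can see which individual assets were responsible. This uses the standard Resource Timing API; the threshold, endpoint, and choice of the ten slowest entries are illustrative assumptions, not our planned implementation.

```typescript
// Sketch: on a slow page load, report the slowest individual resources.
// Threshold and endpoint are hypothetical, for illustration only.

const SLOW_LOAD_THRESHOLD_MS = 5000; // made-up cutoff for "slow"

window.addEventListener("load", () => {
  // Wait one tick so loadEventEnd is populated after the load event.
  setTimeout(() => {
    const [nav] = performance.getEntriesByType(
      "navigation"
    ) as PerformanceNavigationTiming[];
    if (!nav || nav.loadEventEnd < SLOW_LOAD_THRESHOLD_MS) return;

    // One entry per fetched asset: scripts, stylesheets, images, etc.
    const resources = performance.getEntriesByType(
      "resource"
    ) as PerformanceResourceTiming[];
    const slowest = resources
      .map((r) => ({ name: r.name, durationMs: r.duration }))
      .sort((a, b) => b.durationMs - a.durationMs)
      .slice(0, 10); // keep only the ten slowest resources

    // "/beacon/slowload" is a made-up endpoint for this sketch.
    navigator.sendBeacon("/beacon/slowload", JSON.stringify(slowest));
  }, 0);
});
```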