But I'll hang on to the microphone for Q and A, if we actually get to that. Although, you know, I'm kind of at the stage where people ask questions and I don't know whether I have any answers or not. By the way, this idea of breaking up into... I'm going to put up your slide. Oh, all right, thank you. This splitting up into working groups and everything: back in the day, before we had this sort of PowerPoint stuff and everything else, you had to have slides. And so the usual sequence was breaking out and throwing up slides, which sounded like a really bad disease. So I hope that we don't end up with that problem. Well, I'm here partly symbolically, because I was here when we first started this project back 10 years ago. So one of the things that I thought would be useful is to just walk through a few reminders about why we do this stuff in the first place. And the simple, honest answer is that if you can't measure it, then you don't know how to analyze it, and you don't know how to optimize it, because you don't know, quantitatively speaking, how well or how poorly you've done. So measurement is really important. And that's why Measurement Lab was started: because we have this really complex thing called the internet. It's not controlled by any one entity. There are literally hundreds of thousands of networks that are part of the system. And so understanding it is not a question of understanding any one engineering team or any one company. It's trying to understand what's going on in this highly distributed and highly collaborative environment. So M-Lab is partly there to help us understand this organism. The second thing is that by gathering this data, especially gathering it from many different viewpoints, we're starting to see almost like a tomographic system, like a CT scan or an MRI. We are beginning to see what's going on in this complex organism, as we can see beyond the edges.
And that's part of the very important thing: to understand how the network functions not only from the standpoint of a user at the edge, but what's maybe going on inside. And there have been a number of insights that M-Lab has uncovered, which I hope others are going to talk about in more specifics, that have allowed us to see behaviors in the system. Not only the technical and mechanical behaviors, but corporate behaviors: decisions about how much you're going to increase capacity, or whether you're gonna try to force users to move in this direction or that direction by failing to add capacity where it's needed, or maybe even interfering with the way in which the system works at the edge in order to drive users in one direction or another. You know, does somebody hear net neutrality hiding in the background? So that's another reason for doing this. The third one, which I can barely see because of the reflections there, is that if we can gather this data as a time series, then we can actually see change. So it's not enough just to get a single data point to say the speed in the network is X. We're more interested in seeing how the behavior changes over time. And that is one of the reasons that we collect and accumulate the data over this now 10 years of time. The thing I've learned is that whenever I make a big mistake, it's because I made a bad assumption. And although I still do that, I can tell you: don't make any assumptions. Don't assume that you know how this system works. Measure it to find out. So from my point of view, after you get the data, you have to ask yourself, how do I validate it? Sometimes that means having multiple means, multiple methods, for measuring.
Sometimes it means simply getting more than one party to carry out the measurements, maybe using different algorithms to evaluate and analyze, but you need some way of making sure that the data you got is valid. A single measurement does not do that. The thing I'm most happy about with regard to the M-Lab program is that the data is shareable. This is another one of those important points that some of us forget: we don't necessarily have all the answers, but if we provide the data that we have captured to others, they may see things or understand things or analyze things that we didn't get. This is true for a lot of other things, including design, but in particular, with regard to measurement data, having it available for other people turns out to be a really valuable contribution. And I think over the period of the last decade, a number of parties, who sometimes weren't here at the beginning but have joined the M-Lab program, have helped us understand this data in very, very, let's say, insightful ways. I think there's a big issue associated with collecting this data, and anyone who has not been sleeping under a rock will have noticed that privacy has become an extremely big issue in the last few years, in particular as you think about the GDPR from the EU and the issues arising from it. Hi, Alan. Nice to see former colleagues from Google here. The issue of privacy has become very, very important, and what is a little weird for me is that I never think of IP addresses as being anything except just numbers. But some people are nervous that somehow whatever their behavior has been will be tied to a particular IP address, and that someone will then pull out of this mass of data: where did that IP address go, and for how long, and what did they do, what algorithms did they run, or what protocols did they run?
So the privacy issue has to be addressed, and that means that the collection of data has to be protected, or at least anonymized, so that it isn't possible to figure out whose data we're looking at. We just wanna see how the behavior of the system works. There are also some companies that would prefer not to have it known that their system doesn't work very well, and so that's another issue that has to be dealt with. Even though we're all engineers and scientists, and we're trying to be honest and frank about the collection of the data and its analysis, I guarantee that there will be parties at the table who will say, well, I don't mind if you show other people's data, but don't show any of mine. So being cognizant of that has turned out to be an important part of this program. Sometimes you can take kind of a big, edge-like measurement: what kind of data rate am I getting at my house, for example? I won't tell you who my provider is, because sometimes it sucks, but we'll leave it at that. The thing is, measuring just at the edge doesn't necessarily tell you why you're getting that particular number, and so you want to find a way to look more deeply into the system, especially on a time basis. I see significant variations in data rates at the edges of the net. The question is, why is that? Well, seeing more deeply beyond that gross measurement is part of what the M-Lab data can do. And finally, getting global data, like Akamai's quarterly reports and some of the other data that's accumulated under M-Lab, gives us a picture of the internet on a global scale. And I find that to be equally important, because watching the system evolve over time has been pretty fascinating, particularly as you look at the underlying infrastructure. 10 years ago, access to the internet, any high-speed access anyway, tended to be dedicated.
And now what's happened is that we're moving into the 4G and whatever-5G-is territory, and the data rates are going up for wireless. And the reason that's important is that wireless is becoming the most common access method for the internet, whether it's Wi-Fi or LTE or even some of the 3G systems. Mobile has become a primary method of access, especially in the developing world, which is coming to the internet later than many other parts of the world. So understanding how the internet's behavior has changed as a function of the underlying infrastructure is equally important. It wouldn't be helpful if we did a bunch of measurements and then didn't keep the data, because then we would have no ability to look back to see what's changed and how it's different. But that also means hanging onto metadata. And I guarantee you that failure to keep track of the conditions under which you collected the data can leave you with a fairly awkward mess: a pile of numbers, the meaning of which is unknown. It's sort of like gathering scientific data. You make a scientific instrument that's measuring, let's say, temperature. And so you have this great giant bag full of numbers, you know, 82, 71, 93, 104. And then somebody forgets to record the fact that this bag of numbers was temperature. So 50 years later, you have this bag of numbers, and you say, this data is available from the last 10 or 20 or 50 years. Somebody says, what does it mean? And the answer is, I don't know; we didn't record what it meant or whether the instruments were calibrated. So metadata turns out to be pretty damned important, especially if you're trying to assess the quality of the data that's been measured. The historical value is one that I'm very interested to explore for two different reasons. One is just seeing how the behavior of the system has changed over time, its capacity, its latencies, and so on.
But I'm also very interested in being able to go back and formulate theories about why the behavior is as we see it. There are going to be cases, certainly in other scientific disciplines, where you gather a bunch of data. You had a theory that should tell you why the data is what it is. And later on, you discover the theory does not predict, even backward-looking, how the data appears. And so now you need to invent new theories. But then, in order to validate a theory, you'd like to go back and see how well it matches the data over a significant period of time. So I'm a big fan of being able to hang on to that historical data. Another thing, which I think is really interesting, is to observe significant changes in behavior: inflection points. I would predict, if we go back and look at the performance data for the internet all around the world, that we will see an inflection point probably in the 2007 period as the iPhone appears, and then this rapid evolution during the 10-year period of everybody getting mobile phones and getting access to the internet that way instead of on dedicated lines with laptops and tablets. So looking for inflection points in the data is also a helpful way of discovering, and going back and asking, what changed? To what can we attribute that inflection point? Similarly, of course, if we see a significant drop in performance or a disruption or something like that, we can ask what it is that led to that behavior. If we were looking, let's say, at data from Egypt, we would probably see some fairly strange behavior around the time of the Tahrir Square protests, because they tried to turn the internet off, and, generally speaking, succeeded in doing that for a few days anyway. So looking for inflection points can be of historical use.
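[Editor's note: the inflection-point hunting described above can be illustrated with a toy sketch. This is an illustration only, not M-Lab code or data; the function name and the synthetic "throughput" series are invented for the example.]

```python
# Toy "inflection" detector: find the point in a time series where the
# average level shifts the most between a window before and a window after.
def largest_level_shift(series, window=3):
    """Return (index, delta) where the mean of the next `window` points
    differs most from the mean of the previous `window` points."""
    best_idx, best_delta = None, 0.0
    for i in range(window, len(series) - window):
        before = sum(series[i - window:i]) / window
        after = sum(series[i:i + window]) / window
        delta = abs(after - before)
        if delta > best_delta:
            best_idx, best_delta = i, delta
    return best_idx, best_delta

# Synthetic "throughput" data with a jump partway through, the kind of
# step you might see when a new access technology arrives:
speeds = [5, 6, 5, 6, 5, 6, 20, 21, 20, 22, 21, 20]
idx, delta = largest_level_shift(speeds)
print(idx, round(delta, 1))
```

Real analyses would use a proper change-point method over noisy, long-running data, but the idea is the same: locate where the behavior of the series changes, then go back and ask what happened there.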
Those of you who've ever looked at trend data from Google have seen particular words being searched for intensely, and you see these variations going on; you can often tie some of that variation to some event which took place. And so this is almost like counting tree rings, in a way. It's being able to look at the data and say, to what event do I attribute this particular variation in performance? What I am particularly curious about is whether we can see consequences in the performance data of mergers, acquisitions, and disappearances of companies from the internet environment. When people say there's more competition or there's less competition as a result of mergers and acquisitions, I wonder whether the performance data tells us anything. If we see a diminution in competition, or an asserted diminution in competition, can we see that in the data? Does it translate into poorer quality performance? Does it turn into better quality performance because it's now aggregated under one authority and therefore better engineered? I don't know, but I'm curious to see whether or not we can infer anything from the data along those lines. And finally, the thing that I find kind of fascinating: virtual reality is a big deal these days. And being able to wander around in the data with your virtual headset on is weird, but sometimes pretty interesting. I don't know that we've ever tried to do anything quite that kooky with the data that we've accumulated in M-Lab, but the idea that you could wander through a multidimensional space in order to sort of experience the data in a more visceral way than you do by looking at tables might be an interesting experiment. I know that in other disciplines, a virtual reality presentation of data can be quite insightful. I saw some computational data for a rocket engine some time ago, a long time ago actually, when I used to work on the F-1 engines for the Apollo program.
We were trying to understand the kind of pressures and tensions that took place inside the rocket engine as it was firing. And although you can take snapshots, doing an analysis and taking snapshot after snapshot after snapshot, what you don't see until you turn the thing into a movie is the kind of either static wave structures or buffeting that takes place inside the rocket engine shell. And so I don't know, because I haven't seen any movies of the data from M-Lab, whether we would gain any insight from that, but I'm curious to see whether there are some dynamics that we wouldn't normally see just looking at statistical information, whether it's averages or distributions and things like that. So I'm almost done, by the way; if you thought I was gonna go for half an hour, I'm surprised. You get to break out and throw up, or whatever it is you're gonna do, sooner than you thought. I do wanna say, though, that a lot of people deserve a lot of credit for this whole program, especially New America and the Open Technology Institute for starting this process up. And you can tell that it must have been useful, or you wouldn't be here 10 years later. So that's a significant milestone; that's why we're celebrating this. Google, of course, was very much involved in the early days, and so was PlanetLab. I have to say PlanetLab is aptly named, because the guys that put that thing together were really thinking globally in terms of allowing people to try experiments out in the network. So if you wanna look at the sponsors and the partners, go to the website to do that. I do wanna draw attention to kc claffy, because kc is the doyenne of network measurement as far as I can tell. I mean, she's been in this game forever with CAIDA at UCSD. She's outspoken and she's blunt. And she's the sort of scientist that you wanna hang around with, because she doesn't pull any punches and she says, here's the data.
I don't care what your opinion is; here's the data. So kc gets my vote for doyenne of the measurement world. And finally, people like you and others who take advantage of the data that's been collected are helping us understand more deeply this complex phenomenon called the internet. So I really appreciate that. I appreciate the opportunity to speak briefly this morning, but now I'm gonna return you to your originally scheduled program. I'm happy to spend time on Q&A. I'm sure you may have questions; I have no idea if I have any answers. But I'm happy to spend time on that. If you wish otherwise, it's back to you, Peter. So, any questions? Comments? Rants? Okay, so I get one rant, and what? Oh, I'm sorry, Peter. All right, Peter. And you're number two. And by the way, I like the vest. That's good. Where's the jacket? Okay, Peter. So what has been the most surprising thing to you about how the internet has evolved over time? Well, you know, the honest answer is, wow, the problem is rank ordering the most surprising things, okay? So if you allow me to respond at different layers in the architecture, then I can do a better job. That means I have at least seven answers, right? Let me start out at the application level. Frankly, I think the World Wide Web really surprised me. Not because of the technology, which is extraordinary, but because of the vivid desire people had to share what they knew, and this avalanche of content that flows into the net in consequence of the ease of doing that. That really surprised me. I did not anticipate that. I think at the sort of transport layer, the transmission layer, we were certainly gasping for air in the 1990s, because it was a dial-up network. There were 8,000 internet service providers in the mid-90s, and it was all dial-up. The nice thing about that, of course, was that if you wanted to change providers, you dialed a different number. So competition was strong and you could switch easily.
Well, along comes broadband, and there's a collapse of competition, because it's not available from more than a few entities, cable companies and telephone companies, whether it's fiber or cable or a digital subscriber loop or something. So suddenly there's a collapse of competition, but at the same time, there's this significant increase in bandwidth. And so that was one of those plus-and-minus points in the network's environment. But the increase in capacity led to a whole bunch of applications that would not have worked on dial-up, in particular streaming video and interactive kinds of low-latency games and all those other things. So that was kind of a nice surprise, to see the technology flow in despite some of the other consequences. Of course, I have already mentioned the smartphone in 2007. I'm still having trouble believing that that's 11 years ago. It's only 11 years ago; billions of these devices have been built, the applications on them are uncountable, and we rely on them incredibly heavily at this point. My suspicion is that a lot of people in this part of the 21st century are unable to pick up a paper map and understand what it means, because they're waiting to be told where to turn. And of course, some people are so accustomed to turning wherever they're told to turn that they turn as soon as it says turn, even if it's into this guy's lamppost. But this is sort of like worrying that people will never remember anything because we've invented writing; there were objections to writing on exactly those grounds. I don't know what the consequences are of our heavy dependence on the network and these devices, but that's a big change in just a decade's time. Let's see, other surprises. I think that the impact of social networking is also a big surprise. It shouldn't have been a surprise, but it was anyway, and that's the degree to which social networking has a negative character to it.
Shakespeare is still relevant 400 years later because he tells us about the foibles and motivations of people, good and bad, and it's still the same. It hasn't changed. That's why we still watch Shakespeare plays. But the negative consequences of social networking are becoming increasingly visible. The asymmetry of the ability of someone to attack at scale in the online environment is another surprise in some sense. I mean, yes, intellectually you can kind of understand that, but until you actually see the effect, it's visceral. And so we're seeing denial-of-service attacks, we're seeing malware, we're seeing all kinds of other hacks going on, and that's distressing. I think that at some point we are going to have to come to grips with what I'll call the pacification of the internet in order to make it a safer environment. So that's sort of a rapid, not rapid enough, probably, reaction to your question, Peter. Okay, so it's you; do you want to say who you are? I'm Greg Russell. Hi, Greg. And I work on the M-Lab engineering team. I seem to have just picked two Google guys. Did somebody else put their hand up that you want to take? No, there's one over there and there's one over here too, but go ahead. As somebody working on the engineering team, I always want to know what we should be doing in the engineering of M-Lab that we're not doing now. So I was curious what you would like to see this community doing over the next three years or so, and what does M-Lab need to do differently in order to enable that, or participate in that? Okay, so this is one of those questions where I'm as baffled as anybody else about what to do next. I'll tell you what I would like to see, and I don't quite know the best way to do it, but I would like to find a way to allow ourselves to measure performance on a regular basis from all connected devices and still protect people's privacy, in order to get a better, more quantified sense of how the network is behaving.
At Google, we have a very large collection of networks: we have one network that connects all the data centers together, and we have networks that interface to the public internet. And I sit on one of the distribution lists for when things go wrong, and things do go wrong. Things break; it's not even malware, sometimes it's just a bug that somebody introduced and didn't detect. So from the measurement point of view, I would like to find a way to more naturally capture the data, which right now has to be voluntary: somebody has to decide they want to make a measurement or contribute their data. If there were a way to do that, so that people's privacy could be satisfactorily protected, and yet we could see in a more quantified way how the system is performing, I'd be really interested in seeing that happen. The visualization stuff is the other part that I would really like to see more of. Then the only other question, I think, might be figuring out what new media may come along to carry internet packets. What's astonishing is that anything that'll move a bit from point A to point B is in theory adequate to carry an IP packet, because we don't ask very much, right? We just say, please deliver this packet with some probability greater than zero. That's all we ever ask. And then we try to cover up the fact that sometimes the probability that the packet's gonna show up is zero, and we have to retransmit. So figuring out ways to anticipate some of these new behaviors: think about what will happen as we move into whatever 5G is. A lot of it's higher frequency, and it's smaller cell size, so we will have a lot more nanocell or microcell behavior. I don't know what all the consequences of that are gonna be. Handoff, for example, is gonna happen at a higher rate in mobile because of the smaller cell sizes. So trying to anticipate what that's gonna look like would be helpful, if we could model it. Okay, so I had two other hands. Who do I have?
There's one over there, and you were here, so you're three and four. No, no, no, we're recording, and you want your words to be kept for the ages. So who are you? I'm Larry Peterson. Hi, Larry. I know that, but I wanted you to say that. Yeah, Larry Peterson, Princeton. This is Mr. PlanetLab. So I'm thinking back to 10 years ago. I don't know if you remember this: we launched Measurement Lab, and we had a little event kind of like this, and we were telling about all the wonderful things that were gonna happen. And as soon as we got done talking, right in the front row, the representatives from Verizon, Comcast, and AT&T stood up and started challenging whether we could actually collect valid data and come to any valid conclusions. So I'm just gonna put you on the spot. Oh, okay. Just reflecting on the last 10 years, the role that the ISPs have played, and clearly they're an important part of the internet, not originally, but they've become a very important part of the internet. Just your reflections on the role they've played and how they've evolved and grown into that role. Okay. Well, first of all, regardless of your experience with ISPs, you have to admit that they are investing a boatload of money in infrastructure. Sometimes we will say not enough, and sometimes we'll say you're collecting too much from me for the quality of service that I get. But, in all honesty, there's a lot of money being spent there, and so they should get some credit for that. I think, however, that the notion of customer service is kind of lost on many of them. And so that's a place, and it has nothing to do with technology exactly, it has a lot to do with recognizing how much we depend on those infrastructure providers for our daily lives. It's scary sometimes. So there's that aspect. I wish that we could see an increasing amount of cooperation among those parties with regard to some of the, let's say, vulnerabilities in the system.
This is not as secure a system as we all, I think, would like. And we need more cooperation among the parties who are providing the underlying infrastructure in order to defend the system against some of its weaknesses. We also need the engineers to start thinking about how to make the system a lot more resilient and a lot more resistant to attack. And I won't go into the long rant about better programming tools to avoid making stupid mistakes like off-by-one errors, or a reference to a variable that didn't get set so you get a random result, or creating buffer overflows, which seems to be something everybody can do in his sleep, and that's the problem. So getting better tools would be a big help. Other than that, there is one other thing which has been resisted by some of the larger carriers, and that's municipal networks. And although I accept that there have been experiments in municipal networks that have failed because the economics didn't work or the network wasn't maintained properly, I still think that people in the rural parts of the country deserve better than they have been getting. I've spent some time in the last five years or so focusing on Native American populations and their isolation from the internet, but I was just in Placerville, which is outside of Sacramento, and some of the people there are still dependent on satellite access in order to get to the internet, because there isn't any cable. Now there is something happening right now, and this is back to your question too, that I can't predict how it's going to come out, but a lot of satellite stuff at lower orbit is starting to emerge. There's O3b at 8,000 kilometers; they've done pretty well. I don't know from the business point of view, but I actually dropped one of their ground stations into a rural part of Brazil at an internet governance meeting for about a week, and we were getting close to 800 megabits a second up and down. We even did a live interview with Eric Schmidt from Brazil.
He was in New York at the time. And so for $200,000, you could have a ground station there, in theory at a gigabit per second up and down. And it's only about, say, 170 milliseconds round trip time. The physics is about 50 at 8,000 kilometers, but there's some other stuff on top of that. Now, Elon Musk wants to put up some 11,000 satellites in two different layers. Holy moly, they're gonna be falling out of the sky before they get all of them up. Of course, that's good for his SpaceX business, because they just have to keep launching satellites. At Alphabet, we've just announced that the Loon project is spun out now as its own company. These are the balloons at 60,000 feet that deliver up to a gigabit of capacity from the balloon to the ground. And there's an inter-balloon system; there's even an inter-balloon routing system called Minkowski, for obvious reasons, for the mathematicians in the group. The thing that's amazing about that particular design is that the balloons are at 60,000 feet, but they go up and down. If they're trying to get to a service point somewhere, they look for a tailwind to blow them in the direction of the service point. And then when they get to the service point, they look for headwinds so they can loiter. And they actually built a routing system that takes this into account. If they had come into my office and said, this is how we're gonna make this work, I would have thrown them out of the office. So it shows you what I know. So I think we're going to see significant changes if these satellite systems get up. They were talking about 175-kilometer, 200-kilometer, 1,000-kilometer orbits; the latencies are very low, and coverage is extraordinarily high with that number of satellites. So we're gonna have to learn how to measure that too. And it will be really interesting to be prepared to measure it, so we can see what kind of an inflection point we encounter as those systems become available.
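[Editor's note: the latency figures in this answer can be sanity-checked with a back-of-the-envelope calculation. This sketch covers only the speed-of-light physics, with illustrative altitudes; real paths add geometry, processing, and queuing delay on top.]

```python
# Minimum propagation delay through a satellite relay: a packet going
# up to a satellite directly overhead and back down travels at least
# twice the orbital altitude, at the speed of light.
C_KM_PER_MS = 299_792.458 / 1000  # speed of light, km per millisecond

def min_one_way_ms(altitude_km):
    """Idealized minimum one-way delay (up + down) through a satellite."""
    return 2 * altitude_km / C_KM_PER_MS

for name, alt_km in [("LEO (1,000 km)", 1_000),
                     ("MEO (8,000 km)", 8_000),
                     ("GEO (35,786 km)", 35_786)]:
    print(f"{name}: {min_one_way_ms(alt_km):.0f} ms minimum one-way")
```

At 8,000 kilometers the floor works out to roughly 53 ms one way, which matches the "physics is about 50" figure, and shows why the much lower orbits being proposed bring latencies down so dramatically.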
I'm gonna be in Vanuatu next week. This is an island nation in the middle of the Pacific, somewhat east of Sydney. And those islands have been isolated for as long as they've been around, except that now we're starting to see undersea cable connecting a whole bunch of the islands in the Pacific. I would have lost money on a bet about whether cable would ever get to most of those islands. So that's yet another transformation that's going on. And that's why the measurement is so important: because stuff is happening that's gonna change access to the network and its underlying infrastructure, and I wanna see that in the data. Okay, so that's the best I can do, Larry, but thank you for the opportunity. There you go. It's over there. You were nodding your head. I thought, sure, you were, okay, go for it. Right, thank you. Who are you? You have to check? Yeah, just to make sure. So my name is Josiah Chavula. I'm from Malawi, but I'm based in South Africa. So, a comment and then a question. It's almost surreal that I'm in the same room with the man himself. I'm sorry to interrupt you, but I have to tell you, one time I was in a Q&A session and somebody came over and said, are you real? Oh, yeah. You know, they poked me, you know. Can I try that? Can I, yeah, just check. You're amazed that I'm still alive. All right. That wasn't the point. But the point is, as a networking student in Malawi a few years ago, I was just reading about you and the work that you did. I never imagined that at some point I'd be in the same place and talking to you. These were scientific figures that you learn about, but you'd never really get to meet them, because they are a different breed that live elsewhere. And now you're disappointed. Yeah, so I'm quite happy. You look better in the book. Yeah. I'll probably take a picture with you and send it to my former lecturers and colleagues to just show that, yeah. Well, thank you for that.
But my question is: I'm based in Africa, and my main interest is measuring internet performance in Africa, where generally there are a lot of challenges in terms of performance if you compare with the rest of the internet world. And a lot of people, including myself, do recognize the importance of measurements and how measurement data can influence policies, but also influence how operators interconnect and improve their services. But the challenge we've had so far is really getting the measurement infrastructure to scale, to get it into all the right places so that you have enough vantage points to measure and really prove points for many of the challenges. How do you think we should get to a point where it is a lot easier to get scalable measurement infrastructure? For example, would it be imaginable that in the future measurement tools would be available by default on network devices and internet clients? Would that be something that could happen, where every device, every networking component, basically has a feature in it which allows it to almost automatically measure and share anonymized data, so that we are able to understand from whatever point of view without having to deal with these issues of having to deploy lots of devices where it's really hard? I'll discuss more on that maybe in a breakout session tomorrow, but these are really challenges that we have in terms of getting the measurement infrastructure in place to measure and prove cases where we see a lot of suboptimal performance of the internet. So actually, there are two kinds of answers. Let me wander over here so I'm not speaking to you from behind. Two kinds of answers to that question. One of them is technical, and that's where you were headed: having devices that are sort of self-measuring as a normal part of the design. And I rather like that a lot. When the original ARPANET was done, we didn't quite know how it was gonna behave at all.
So there was a very significant amount of work put into measurement in each of the interface message processors, the IMPs. And it was a good thing, because our first routing algorithms didn't work very well, and Bob Kahn and I knocked the network down regularly to show that it would congest, and things like that. So that's one avenue, which is to ask the designers to think about measurement even if it never gets turned on; just the idea that you put that into the design, I think, is a good one. But the problem of getting that stuff turned on, presuming it exists, is a policy issue. And it's got to be, I think, a regulatory one. I don't see any other way around it. The people who supply the services need to have their feet held to the fire on a regulatory basis: you have to show transparency of the performance of your system. You can make all the claims you want in your advertising and everything else, but you've got to show me the numbers. And we need to have the ability to independently evaluate the numbers. It can't be that the company that provides the service simply asserts, we measured everything and it's wonderful. And to be perhaps controversial about it, even our regulatory agencies are quoting numbers which I don't necessarily believe are accurate with regard to the performance of broadband or the reach of broadband, which is why I'm a big proponent of going out and doing real measurement on an independent basis. That's why M-Lab is such a valuable activity. But I think that you will not succeed in your objective unless we have real regulatory authority to say transparency is required if you're going to offer and make claims about your service. Okay, we're probably way over time now. Let's take one more. We got one more. Okay. Who are you? I am Michael Wawrzoniak, Princeton PlanetLab. So I have a speculative internet architecture type of a question for you. Seems appropriate. That should be fun. 
So Measurement Lab, we're measuring the internet. Then there's analysis consumed by humans. Policy might be affected by it. A sort of natural extension of this is to think about, well, what if the analysis was also done in real time? What if that was a publicly shared, available analysis? What if the control was also derived from it in real time? What if Measurement Lab was a step towards a logically centralized global planetary control plane that is shared by others, et cetera, et cetera? Okay. Do you know where this is? Well, yeah, except I don't think it scales. I mean, if we were in a bar somewhere having a drink, I would call you an OpenFlow freak or something like that. Which, by the way, works actually pretty well, but in a, let's say, constrained setting where centralization works, and also where the data is fresh. I mean, one of the big problems is the freshness of the data. Well, if you're gonna make decisions about routing by gathering data centrally and then making those decisions, then you do run into the problem of freshness of the data and scaling. So, but you're, I'll tell you what: if you believe that you can make something like that work on a decentralized basis, then the decision making, which is localized now, because you said it was distributed, raises the question of the stability of the choices that are being made, because each one of the local decisions is bereft of knowledge of the state of the network that's far away from that decision. And so making that coherent is not so clear to me. You want to defend yourself? I'm not saying this is, I'm not saying I have a solution to this, yeah. Let me, in an environment where you have predictable flows, statistically predictable flows, this kind of thing can work, because you don't have the kind of variations that cause trouble in dynamic decision making. 
So in some of the, in our network, for example, because of the scale, the flows between the data centers are actually fairly predictable, and so we can use an OpenFlow technique. Also, the number of different facilities is relatively modest. So that seems to work really well. We get excellent results from that, or inside the data center, which is localized. But if we're talking about the general internet, it's really diverse. The data rates at the edge vary dramatically. So I'm not so sure whether you'll have a stable result. It would be interesting to try to simulate that to see whether or not it makes sense. You want to respond to this? It's totally different. Okay. No, use the microphone. Who are you? I'm Ken Biba. I'm from Novarum. In fact, 30, 40 years ago, I was on the internet committee, was the junior guy in the back of the room. Back then, right. And we're working with the California Public Utilities Commission. I don't know if it's interesting stuff to talk about. But Mike, I just did a study inside of San Francisco and discovered that there's about one Wi-Fi access point per person in San Francisco. That's amazing. That's an issue of scale. There you go. Okay. I think we're done. But thank you very much. I appreciate it. Thank you so much for that. Yeah, thank you again. So we're just gonna do a couple quick minutes to build off of what Vint was talking about and reflect a little bit on the last 10 years, and then catch us up in time for a break before the first round of lightning talks. So in case you forgot, I'm still Peter Booth. I'm still Georgia Bullen. So why are we here? We're here in part because Measurement Lab was not guaranteed to work. Like, well, let's just measure the data, share it with the world. People will care, maybe. We'll be able to do it. We'll be able to handle all the data and just share it with the world, maybe. There were a lot of open questions here. And it worked, or it's worked for a decade. That's great. 
So we should celebrate that. That's really cool. And so that's part of why we're here. And also, well, we have a community of people here. That's why we've lasted. Yeah, I think a big portion of why we wanted to have this event was to get many of you in the room who've never met each other. Maybe you've read each other's papers, maybe you've read each other's blog posts, talked on a video conference over this wonderful thing that we measure. But maybe you've never been in the same room. So a lot of this is really about getting you here together to meet each other. And while you're talking to each other, one thing that is kind of interesting: in these scale-free contexts, how long something has lasted is the best indicator we have of how long it might last. Which means that we need to plan for the future, because we might have one. So identifying future needs amongst all of you is a critical portion of why we're here. We have 10 years of data. You know this, but let's just go through it really quick. Our current daily volume: we get two million NDT tests every day, worldwide. It's super great. And that's not counting our SideStream efforts, where we've instrumented every TCP flow into the platform. That's not counting our Paris Traceroute. That's not counting Diff detect. That's just from one experiment. In 2009, we had our first NDT test. By 2012, we had 200 million in total. And in 2017, we celebrated having a billion rows of open speed test data. It's super, super cool. And if you go to the next slide, you can see that actually in 2014, when I joined the project, it was a little bit of a scary time in terms of our data. We founded the project, and then every year we got less and less data. You know why? Well, I had a hypothesis. So in 2014, the hypothesis was: our tools were too hard to use and inaccessible to modern web browsers. 
So I said, okay, please, please, boss, may I make the NDT tool work with modern web browsers by adding WebSocket support and adding TLS support to it. And I did that; it took a long time because it's a little bit of a hairy code base. I think you called it a quick hack. I called it a quick hack. And it took me a year and a half. So there's that. And so I was still working on it in 2015. But once we got WebSocket support, a bunch of new websites came online. And then once we got WebSocket and TLS support, through a series of coincidences that is almost unbelievable at this point, we developed a partnership with a team inside Google called the OneBox team, which suspected there was user demand for, if you type how fast is my internet, just clicking a button right there on the search result page. And so, if you type that right now, you can do that. Someone is here from that team, so if you have any questions about that, there's a good person to ask. And because of that partnership and other integrations of NDT, we saw this ever-declining trend turn around dramatically. It's really dramatic. You can see we definitely weren't getting two million a day in 2015. But now, this year, we're on track to get 700 million speed tests in our database. We're getting an unprecedented view. And we're getting an unprecedented view worldwide. But let's talk a little bit about the history of the data. And I'll turn it over to Georgia. So the other thing that that's enabled is allowing us to get a bigger diversity of connections. Initially, by not having modern web browser support, JavaScript support, most of the tests that we were seeing were coming from actual Java integrations, C-client integrations, things like that. And so we were only measuring some of the internet. And this graphic, which is on t-shirts, which if you haven't gotten one, you can take one there outside. We made them for you, so please take one. 
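For intuition, a single-stream speed test of the kind described here reduces to pushing bytes over a connection for a fixed window and dividing by the elapsed time. Below is a minimal illustrative sketch in Python against a local TCP sink; it is not the real NDT protocol or its WebSocket framing, and the port and function names are invented for the example:

```python
import socket
import threading
import time

def run_sink(port, stop):
    # Minimal server: accept one connection and discard incoming bytes,
    # like the receiver side of an upload throughput test.
    srv = socket.socket()
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", port))
    srv.listen(1)
    conn, _ = srv.accept()
    while not stop.is_set():
        if not conn.recv(65536):
            break
    conn.close()
    srv.close()

def measure_upload(port, seconds=0.5):
    # Client: push data for a fixed window, then report mean throughput,
    # which is the basic shape of a single-stream speed test.
    stop = threading.Event()
    threading.Thread(target=run_sink, args=(port, stop), daemon=True).start()
    time.sleep(0.1)  # give the sink a moment to start listening
    sock = socket.create_connection(("127.0.0.1", port))
    chunk = b"\x00" * 65536
    sent, start = 0, time.monotonic()
    while time.monotonic() - start < seconds:
        sock.sendall(chunk)
        sent += len(chunk)
    elapsed = time.monotonic() - start
    sock.close()
    stop.set()
    return sent * 8 / elapsed / 1e6  # megabits per second
```

A real test adds protocol negotiation, warm-up, and TLS on top of this, which is part of why the production change took so long.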
It's not just an album cover that you've maybe seen on Tumblr. This is actually M-Lab data, chunked by six months and showing the median download shift over time. I made this in February, so actually the six months in 2018 that's in here is just through February, to give a sense for this. But this is actually the moving median speed of what we're measuring in the data. So please take a t-shirt. You can get one at the break, which we'll have in a few minutes. It's super great. We can tell a story of the past 10 years of the internet getting better, and we can tell it from data. You can see it used to be not as good, and it is getting better. That's great. We also have 10 years of community, so M-Lab doesn't happen alone. There's the core team. People who are on the core team, who work on this every day, can raise their hands. You can get a sense of them in the room. There's a bunch of us here. Yeah, that's good. But actually M-Lab survives on contributions from the entire community, and that's everyone from the governments and regulators that we work with, to researchers writing about the data, which I think probably covers many of you in the room, to academics who are developing the experiments that we host, some of which you can see here with the partner institutions that we work with. And particularly, many of our sites, especially our early ones where we host servers, were donated space from community partners around the world. So just to give a sense of where some of our first major installs were, this shows you every time we added a continent to the platform's footprint over the last 10 years. There have been lots more, though. These were just some of our key first sites. Today we have over 500 servers in over 130 locations. And again, a lot of that is through partnerships, from connections that you all have helped us make, people who have come to us. Last week we got an outreach from someone in Pakistan who wants to host a server. 
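The chart's "median download, chunked by six months" is straightforward to reproduce from test-level data. A small sketch, where the function and field names are illustrative and not taken from the M-Lab schema:

```python
from collections import defaultdict
from datetime import date
from statistics import median

def halfyear_medians(samples):
    # samples: iterable of (date, download_mbps) pairs.
    # Bucket by six-month window (year plus H1/H2), then take each
    # bucket's median, mirroring "median download, chunked by six months".
    buckets = defaultdict(list)
    for d, mbps in samples:
        key = (d.year, 1 if d.month <= 6 else 2)
        buckets[key].append(mbps)
    return {k: median(v) for k, v in sorted(buckets.items())}
```

The median, rather than the mean, is the natural statistic here because speed distributions are heavily skewed by a small number of very fast connections.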
So that is our challenge for the next couple of months: to get some machines there and get that set up. Once I respond to their email, which I probably owe many of you as well. But we'll do that, and then we'll make that happen. And that'll be great. The different colors? So the different colors are the number of services that are up at each of those sites. This map is derived from our monitoring system. Yes. And our load balancing system. So it's just a snapshot of server health at one particular time. So the three red ones aren't actually bad ones. Those are currently cloud servers that we are not collecting data from, which is why they're red, but they show up on the map. They're there to handle extra load, because we have had lots of people running tests, and we want to make sure that they get good measurements. But those are actually a pilot of a cloud server, which you can hear about more in some of the breakouts. But yeah, I screenshotted that last week for the presentation. So, to give you a sense of who some of those partners are: this is what you can see more of on our who page on the website, but just to give you a sense, and to acknowledge and say thank you to all of our amazing site partners who help us host servers and collect data, so that we can give people better measurements around the world. So the question for the two days is: what's next? A little-known fact is that while M-Lab's platform has been 100% open source since the day it was founded, we used to process our data by having a job which copied all the data into Google internal systems, processed it internally using closed-source code, and then spat the analysis back out to the world. And that's how we existed for a long time. Eventually we sort of reached a crisis of conscience, and we realized this is not in keeping with the spirit of the project. So, in 2017, we finished an engineering effort. I say we, but Greg gets a lot of credit here. 
And now we are 100% open source. You can track every bit from collection, to initial storage, to subsequent processing, to insertion into the database, to insertion into a visualization. There is no secret sauce anymore. And we're really kind of proud of that. So we have a new data pipeline. We're building a new platform, and we're monitoring everything in order to make the platform observable, which is a really cool thing. We've been calling it M-Lab 2.0. I found out last week we were calling it this. This is kind of interesting to me. If you write it on the slides, then it's true, I was told. But the shortest version of what M-Lab 2.0 is, is experiments in Docker containers, everything managed by Kubernetes, and everything monitored by Prometheus. M-Lab's mission is and remains to measure the internet, save the data, and make that data universally accessible and useful, and our architecture mirrors that: we measure the internet, we save the data, and then we push it through visualizations to make it universally accessible and useful. So it's kind of nice to have an architecture that mirrors your mission instead of just your org chart. So that brings us back to the community and what we're here to do for the next two days. So my charge to all of you is to make sure that you talk to your neighbors at your tables and make connections, so that there might be projects that come out of this, or brainstorms, or ideas that you can follow up on, or that you can give us to do. We'll take homework, because that's part of why we're here. Attend the sessions and ask critical questions. Find out why they did something a certain way, or how they did it, or think about how you might have done it differently. And then bring your ideas about all of those future measurement needs, future expansion ideas, and future projects to the community, and let's see what we can do. So thank you for joining us. I'm really excited about this. 
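To make the "everything monitored by Prometheus" part concrete: Prometheus works by periodically scraping a plain-text /metrics endpoint over HTTP. This is a stdlib-only sketch of a service exposing counters in that text exposition format; the metric names are hypothetical, and a real deployment would use an official Prometheus client library rather than hand-rolling the handler:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical counters a measurement service might track.
METRICS = {"ndt_tests_total": 1234, "ndt_errors_total": 7}

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Prometheus scrapes this endpoint and parses the text
        # exposition format: one "name value" sample per line.
        body = "".join(f"{k} {v}\n" for k, v in sorted(METRICS.items()))
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.end_headers()
        self.wfile.write(body.encode())

    def log_message(self, *args):
        # Keep the demo quiet instead of logging every scrape.
        pass

def serve(port):
    # Run the metrics server on a background thread and return it.
    srv = HTTPServer(("127.0.0.1", port), MetricsHandler)
    threading.Thread(target=srv.serve_forever, daemon=True).start()
    return srv
```

The appeal of this pull model for a platform like this is that the monitoring system, not each experiment, decides how often to collect, and a dead endpoint is itself a signal.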
It's amazing to have you all in one room, and let's start the break by meeting people. Yeah. Okay, do we have time to just move? Yeah. I have two questions. Okay. Oh, yeah. I can just come along and answer them. So, questions I have. The first one has to do with reconciling our data with other data that might be available. I'm thinking of Akamai in particular, because of the number of test points they have, tens of thousands of them as near as I can tell. I don't know whether their data is in a form which is reconcilable with ours. So I raise that as a more general question, and that is standardizing the format of the data and metadata so that people who are accumulating information can contribute it to a global pool, whether it comes out of M-Lab or not. But that means we need to publish what the formats are, use schema.org and so on, and make sure that we get calibration and metadata as well as the raw numbers, in order to make them reconcilable and comparable. But the specific question is whether anything that Akamai currently publishes can be reconciled with, either validated, supported, or even perhaps invalidated by, our data. Look, neither of us has done that specific homework. Yeah, I now have a homework assignment. I think that's a big question that a lot of us are working on. In the US there's been a lot of conversation about how to take the various public data sets that government agencies provide and how to map those to M-Lab data. That's something that we're working on in a couple of contexts, and there'll be presentations on it in the next session and later today. So I think that's one place where we've definitely started having that conversation. 
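On the standardization point, schema.org already defines a Dataset vocabulary that could carry the kind of format and unit metadata being asked for, serialized as JSON-LD. Here is a sketch of building such a record in Python; the property names follow schema.org, but the field list and unit conventions are assumptions for illustration, not an agreed M-Lab or Akamai standard:

```python
import json

def dataset_record(name, url, unit, fields):
    # Build a schema.org "Dataset" JSON-LD record describing a
    # measurement data set: each measured field is declared as a
    # PropertyValue with an explicit unit, so that two publishers
    # using the same vocabulary become mechanically comparable.
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "url": url,
        "variableMeasured": [
            {"@type": "PropertyValue", "name": f, "unitText": unit}
            for f in fields
        ],
    }

# Example: a hypothetical speed-test data set with two measured fields.
record = dataset_record(
    "Example speed test results",
    "https://example.org/data/speed-tests",
    "Mbit/s",
    ["download_throughput", "upload_throughput"],
)
```

Publishing records like this alongside the raw tables is the "publish what the formats are" step; the calibration metadata mentioned above would be additional properties layered on the same record.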
Another, which there's a session on tomorrow that Anchi is leading, if Anchi wants to raise her hand, with one of her colleagues who's based in Ethiopia, is looking at internet shutdowns specifically. And that's a place where the companies and a lot of the open data sets have been trying to figure out how we can alert when we see something happening, what might be signals of networks going down in certain parts of the world, and how we can work together better around different open data sources. So I expect and hope that in a lot of places like this, Akamai is a good partner; Oracle Dyn has been a partner there, and those are places where we're starting to look at questions like that. Sure. Anybody else? If not, start this break by meeting someone at your table and getting coffee, and then we'll be back here in about 10 minutes to kick off the first round of breakouts. Thank you. Thank you.