Okay, I'm going to talk in the breakout session about a paper that I wrote and just last week found out didn't make it into the publication I submitted it to. That was IMC, and apparently IMC has a huge rejection ratio. I wanted to show one picture out of this paper and talk a little bit about how it was generated.

What we noticed is that in the M-Lab data, in these 1.4 billion rows, a substantial fraction, something like 17%, comes from a pool of roughly one and a half million beacons; it depends on exactly how you qualify beacons. These beacons run repeated tests for whatever reason, and because the same device is running tests over and over again, they provide results that are comparable to themselves. So this is a simple plot of nearly 10 years' worth of data in Europe showing a gradual improvement of the European Internet, roughly a factor of four improvement in typical performance for the beacons.

Let me describe the graphs. The graph on the left is a count of the number of tests per week from the pool of beacons at each of a bunch of different performance levels. The graph on the right is the same data normalized so that the total number of tests is at the top, so it becomes percentages.

One of the cool things about that graph: first, I invented a technique for generating it that seemed like a throwaway because it was very easy to do, but it turned out to be very efficient. That graph took about 40 seconds to generate from 1.4 billion rows; BigQuery is a big deal. (A minimal sketch of this kind of aggregation appears below.) The other thing is that, because of the way it was generated, the data is actually linear in an algebraic sense, so you can do transformations on the data that you can't do with other metrics. I want to talk about that a bunch in my presentation, along with a bunch of other things.

We'll edit that. You'll have to go to Matt's session to ask him questions; that's what I'm going to do. Reza? So Matt will be...

Hi, thank you very much, and thank you very much to all the M-Lab founders and everybody that's been so helpful over the years; we've really taken advantage of your motivation. My talk is going to be about standardization of Internet measurements and the policy applications it has. I'll be the dismal economist, sort of. It's going to be mostly policy applications, the challenges, and trying to convince policymakers to look beyond advertised speeds.

In terms of mapping, just an overview here: this is a speed distribution. I don't know if you can see it, it's a little small, but we can make it big. This is the distribution of speeds using M-Lab data from the bandwidth widget that RIPEstat created a couple of years ago. I put it up here to show what kind of richness of market structure this kind of data lets you identify. Here we have Canada and the US, circa 2015-2016. In the Canadian case we have a bunch of bumps here around 5, 10, and 15 Mbit/s, and we don't have this in the US. A few years earlier you had it in the US too; the reason is that US carriers stopped speed tiering around 2011-2012, but the Canadians didn't.
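To make the weekly beacon aggregation described above concrete, here is a minimal sketch of the same kind of bucketing in pandas. This is not the speaker's actual BigQuery query; the column names, bucket edges, and DataFrame layout are illustrative assumptions.

```python
# Hypothetical sketch of the weekly beacon aggregation described above:
# count tests per week in each performance bucket, then normalize each
# week to percentages. Column names and bucket edges are assumptions.
import numpy as np
import pandas as pd

def weekly_performance_counts(tests: pd.DataFrame) -> pd.DataFrame:
    """Rows = weeks, columns = download-rate buckets, values = test counts."""
    edges = [0, 1, 2, 4, 8, 16, 32, 64, np.inf]   # Mbit/s, illustrative
    labels = ["<1", "1-2", "2-4", "4-8", "8-16", "16-32", "32-64", ">64"]
    tests = tests.assign(
        week=tests["test_time"].dt.to_period("W").dt.start_time,
        bucket=pd.cut(tests["download_mbps"], bins=edges, labels=labels),
    )
    return (tests.groupby(["week", "bucket"], observed=True)
                 .size()
                 .unstack(fill_value=0))

# Left-hand graph: raw counts per week.  Right-hand graph: the same matrix
# with each week normalized to 100%, so it reads as percentages.
# counts = weekly_performance_counts(beacon_tests)
# shares = counts.div(counts.sum(axis=1), axis=0) * 100
```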
There is an underlying policy reason: the Canadian network neutrality framework, the Internet traffic management framework, really allowed economic traffic management practices to persist. This has changed since then, but my point is that you can extract information about evolving market structures that vary across countries, regions, and localities, and that this matters.

The second graphic is from one of the nice visualization tools that I don't think is active anymore. It was the Google Public Data Explorer view of the M-Lab data, which was very useful for teaching and also for presentations to policymakers. The reason I have it here is that it clearly shows that variation in the strategies of the operators, rather than just their technological endowments, is important. This issue has not been explored very much in terms of the quality of service and the speeds operators deliver to users, to consumers. So understanding this strategic variation is very important. These are the usual suspects: the DSL providers are on the left, the cable companies and fiber on the right. What is important is that some cable companies that could be providing much faster services don't, whereas some rural providers are actually delivering much faster speeds than the ones that dominate urban areas.

I'll end with why this picture is important. We're trying to bring a consumer-centric perspective to this. As you've heard, the battle around interconnection issues and network neutrality is important, but it has also taken focus away from why this matters for normal people. This is a graphical representation of about 20,000 consumer complaints that my colleague Carmen from the National Hispanic Media Coalition managed to get out of the FCC last year during the network neutrality proceedings, and it is sort of at the center of the legal case right now. I guess the point is that speed and quality are the critical reasons people complain that they are not able to access the open Internet, or get whatever they want from the open Internet, and this is the primary driver of consumer concerns about Title I versus Title II.
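As a hedged aside on the tier "bumps" around 5, 10, and 15 Mbit/s in the Canadian distribution mentioned earlier: one simple way to surface such bumps is to histogram the measured download speeds and look for local peaks. This is only an illustrative sketch, not the analysis behind the figure; the series name and bin width are assumptions.

```python
# Hypothetical sketch: find local peaks in a histogram of measured download
# speeds, as candidate advertised speed tiers (e.g. bumps near 5, 10, 15 Mbit/s).
# The series name and bin width are illustrative assumptions.
import numpy as np
import pandas as pd

def candidate_speed_tiers(speeds_mbps: pd.Series, bin_width: float = 0.5) -> list:
    """Return histogram bin centers whose counts exceed both neighbors."""
    edges = np.arange(0.0, float(speeds_mbps.max()) + bin_width, bin_width)
    counts, _ = np.histogram(speeds_mbps.dropna(), bins=edges)
    centers = (edges[:-1] + edges[1:]) / 2
    return [float(centers[i]) for i in range(1, len(counts) - 1)
            if counts[i] > counts[i - 1] and counts[i] > counts[i + 1]]

# e.g. candidate_speed_tiers(canada_tests["download_mbps"]) would be expected
# to surface peaks near the advertised tiers if carriers are speed tiering.
```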
We just submitted this to the NTIA call for broadband mapping a couple of weeks ago, and you guys, I think, submitted something too. Basically we used this argument to say that you should be using M-Lab data as a baseline for a big-data approach to understanding the emerging differentiation. You can take data like SamKnows or Speedtest, where the test servers are installed at the cable head ends or the central offices, so they essentially measure the last-mile link; that might be realistic when you are talking about prioritized or cached traffic. I'd like to thank the M-Lab community for keeping up the standard of off-net measurements in this environment; you remain unique in that context, so we'll see if they listen. Finally, we'll go through some broadband mapping in Canada, work with various levels of government, and the policy applications and implications this has had so far.

I'm Ken, and I'm with Novarum; we're consultants to the California Public Utilities Commission. To the theme of the previous meeting: I spent 40 years of my career building networks, including some of the early cable modems and helping invent Wi-Fi, and the last 10 years have been spent working in these public-interest areas, educating people about what Wi-Fi is and trying to discover what the ground truth is in broadband.

For six years now we have been running a program called CalSPEED, which measures mobile broadband in California, creates maps throughout the state, and is actually used for public policy. We were very concerned about the essential metrics of broadband in California, and we use them to create the maps you see on the right: the upper map is mean download speed for AT&T, and the one on the lower right is mean download speed for Sprint, roughly in the fall of last year. So far we are focused on mobile, and we're about to kick off a process on wired and Wi-Fi as well: we're going to put little boxes throughout California to measure wired broadband and, by the way, all the Wi-Fi each box can see.

We're sponsored by the California Public Utilities Commission and were originally funded by an NTIA grant back in 2011, and the project was successful enough that we're now using taxpayer money in California, funded by the California Assembly. Novarum does the design and analysis, and we have academic partners: Cal State Monterey Bay, who do the tools, and Cal State Chico, who actually do the mapping.

One of the key things is that 95% of California has no crowds; it's rural. So conventional crowdsourcing simply doesn't work for the problem we're trying to look at. What we've chosen to do instead is sample: we go to about 2,000 California locations.
We go there every six months, we take a bunch of testing tools, which are essentially smartphones these days, we run a common testing suite, and we come back six months later to the same GPS locations. So we essentially have six years of data in the same places, at roughly the same time of year, to the same servers. We upgrade the smartphones because we found that the biggest technology driver is the chips inside the smartphones for the radios, as well as the mobile infrastructure. So we go to the carriers and say, give us your best smartphone, and we use the carrier data plans for that purpose. We do it for all four carriers, soon to be three.

Because we're trying to look at the complete Internet experience, we measure to what we call a near server, in San Jose, which roughly emulates caching behavior, and to a far server, in northern Virginia, which emulates the rest of the Internet; both are in the AWS cloud. We've been doing this since 2012. The code is open source and available online, and the data is available as well. The key thing is that we're trying to measure what the user experience is: if I were a user, what would my experience be? That's why it's important for us to look at both local data and faraway data, because not all data on the Internet is cached.

Because we began early, our core engine is iPerf rather than NDT, which is just as good. One thing it does give us is that all our throughput data is at one-second resolution, so as we run our tests we get some very interesting data on how TCP behavior changes over time. Our servers actually go to a gigabit, so we can measure a gigabit as long as the network between us and the server can deliver it.

One of the things we're proudest of is building our maps; you can see two of them here. We found a fish biologist at Cal State Monterey Bay who was sampling Monterey Bay for temperature and trying to build maps of thermoclines, and it occurred to me that a map of thermoclines in Monterey Bay is very much like a map of broadband. So we adopted a geostatistical technique called kriging that lets us take those 2,000 spots and do an analysis that we think gives us resolution down to about the kilometer level (a minimal sketch of kriging appears below). We also have a zoom capability: we can enable, say, a subdivision by giving them their own smartphones and our tools, and they can measure every hundred meters if they want a finer-grained map, down to the block level, and we integrate all those levels of maps together.

We do that for throughput and latency, and we construct some synthetic measures, such as a mean opinion score for synthetic voice over IP, which turns out to be a very important piece of information. And we try to make a difference for opinion makers. One of the key things this is now being used for: opinion makers pay attention when dollars are involved. California has a fund, the California Advanced Services Fund, that comes out of a little bit of tax dollars; it's a mere few hundred million dollars. Getting access to that money to improve broadband means you have to show that there is a deficiency, using this test, and once you put something in, you have to show that you've developed and delivered the service, again using this test. So, using your criteria, Vince, you have an interesting success story.
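Since kriging is the centerpiece of the mapping just described, here is a minimal sketch of ordinary kriging over scattered speed samples. This is not CalSPEED's implementation; the exponential variogram, its sill and range parameters, and the planar-kilometer coordinates are illustrative assumptions.

```python
# Hypothetical sketch of ordinary kriging: interpolate a value (e.g. download
# speed) at an unmeasured point from scattered samples. The exponential
# variogram and its parameters are illustrative, not CalSPEED's fitted model.
import numpy as np

def variogram(h: np.ndarray, sill: float = 1.0, rng: float = 10.0) -> np.ndarray:
    """Exponential semivariogram: rises from 0 toward `sill` over distance `rng` (km)."""
    return sill * (1.0 - np.exp(-h / rng))

def krige(xy: np.ndarray, z: np.ndarray, target: np.ndarray) -> float:
    """Ordinary-kriging estimate of z at `target` from samples (xy, z)."""
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)  # pairwise distances
    a = np.ones((n + 1, n + 1))
    a[:n, :n] = variogram(d)
    a[n, n] = 0.0                                                 # Lagrange row/column
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(xy - target, axis=1))
    w = np.linalg.solve(a, b)                                     # weights + multiplier
    return float(w[:n] @ z)

# e.g. estimate download speed at an unmeasured grid cell from the ~2,000 samples:
# speed_est = krige(sample_xy_km, sample_mbps, np.array([x_km, y_km]))
```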
Because we're measuring rural areas, we actually have data about rural areas, and it turns out that on almost every metric you can possibly imagine, rural areas get about three-fifths of the quality of mobile broadband that urban areas get. And because we can map against other kinds of mapping data, for example the map of 9-1-1 calls on the lower right, we can map where we think high-quality voice calls would happen, and it turns out that only about 40% of those 9-1-1 calls are covered by high-quality voice. The map on the top is the map of high-risk fire areas, and it turns out that for the two best carriers, Verizon and AT&T, roughly 80% of the very-high and severe fire-risk areas are uncovered by high-quality voice.

Good stuff. So, you have three super interesting, detailed talks to go to. Ken is going to be in the conference room over there. Matt is going to... Matt wants to get to all three; I feel bad for putting these three in the session at the same time, but thankfully you're all here to keep talking to each other. So, just so you know, Matt will be over here, Reza will be over there, and Ken will be in the conference room straight across the hall. Ready, go.