So hello everyone, welcome to the next seminar. Today's speaker is Aaron Schulman. Aaron is doing a post-doc here at Stanford with Sachin Katti. And he received his PhD back at Maryland, where his thesis was on observing and improving the reliability of last mile connections. And this includes different technologies: Wi-Fi, cellular, and residential links. And today's talk is going to be one part of this, which is how weather affects residential links. And this thesis was recently awarded the SIGCOMM Doctoral Dissertation Award, so congratulations. Thanks a lot. So I'm going to talk about what I felt at the time was a really simple, basic question, which is: how does weather affect residential internet connections? And what I'm going to show you is that it turns out answering that question involves quite a lot of work. Some of that is actually just building an infrastructure to observe when residential links fail, and it turns out that's even useful for more things than weather. But it's a lot to set up, and you'll see that the actual weather results at the end make up only a small part of the study itself. But let's dive on in. OK, so we know intuitively that weather affects residential links. Lightning tends to strike antennas. Water can seep into telephone lines, as well as coaxial cable; this is a well-known phenomenon that's been recorded back to the early 1900s, and it's why we actually pressurize some of our telephone cables. And finally, of course, wind can snap trees and can also cause stress on wires. So why do we care about this? Well, it turns out that when customers want to choose a provider in their area, this is actually a reasonably important thing.
As we move toward having TV and other services delivered over our last mile internet connections, if customers live in an area that's prone to something like freezing rain, they would want to know that a particular link type or a particular provider in that area doesn't do so well when there's freezing rain, and also how long it would take to repair in freezing rain. And also for the providers, we want to inform them what their problems are. We want to actually measure across all providers and say, hey, there are these little mini natural disasters going on all the time: a little bit of wind, a little bit of rain. When an eventual larger disaster comes along, here are the links that you should be worried about, essentially. And maybe you can solve those ahead of time, or fix those ahead of time. So these are examples, by the way, of the questions you would want to ask. How does weather affect residential links? Are these new fiber deployments more robust to something like wind? Does rain, not thunderstorms, but rain itself, correlate with internet outages? And are these extreme temperatures that we've seen correlated with failure rates, and how? And the tough thing is, to answer these kinds of questions, you need to observe huge swaths of locations, because you need to observe different weather in different locations. There's more snow in North Dakota than in California. You also need to observe lots of link types. It's not very interesting if you only look at DSL; you want to look at DSL versus cable versus fiber versus wireless internet service providers. So we need to get a lot of data, and that was the first challenge in this work. And one of the other reasons you need a lot of data is that I kind of think each residential deployment is like a pretty unique little butterfly.
And I don't know about you guys, but when I'm walking down the street, I spend most of my time looking up at telephone poles and down at infrastructure to see what kind of infrastructure we have in that area. Here's an example of some photos I took while walking around when I was in Maryland. This is a Comcast deployment. Basically, in the head end, the main Comcast building, you have the CMTS, which has a fiber link that goes out to this, which is the node. It's kind of hard to see the fiber link; it's one of the thinner cables that goes into it. The node does a media conversion from fiber to coaxial. And then there are all these splitters hanging off of it, which don't look like much thought went into which splitters needed to go where at all. Eventually that splits off and goes through a few amplifiers, if necessary, and then eventually a final set of splitters splits off into the actual homes. And if you look across cable deployments, if you look across DSL, et cetera, the infrastructure is different. It's different brands. It's deployed in different ways. There are different splitters of different ages; the cables are new and old. And some of them look like they're completely falling apart, as I've seen here, actually, in Menlo Park. So just looking at one location's cable deployment is not useful. We need to look at the broad set of deployments across the United States, or wherever we're observing, to answer how weather affects them. So here's our approach. We view weather as this really natural experiment, which it is. When the weather rolls in, hey, you have a forecast of it, which is really nice. So you say, ah, here comes the weather; I'm going to start looking at the residential links in the area where the weather is. So I'll look at a bunch of different providers: Comcast, RCN, Verizon, whoever's there.
And since I want to look across all these different link types, I can't use one probing technique that only works on fiber or only on cable or only on wireless. I need a universal type of measurement tool. So I just use ICMP pings. And we'll see that that creates a lot of headaches, but it's, I think, the best tool for observing whether links are failing or not across many different types. Again, there are a lot of links out there, and not all of them are experiencing weather. So I wanted to be careful with my resources. I want to make sure that when I'm probing something, that link has some chance of experiencing weather; I don't want to waste my resources probing other links that won't experience weather. And finally, I'm going to look at those ping losses, and I'm going to look for abnormal sequences in the ping losses, and those I'm going to interpret as failures. And I'll tell you how that works. So here's the general flow of the actual collection and analysis of data. We start out with a ton of data: we're pinging the links, and that's why we have a big arrow. Then we filter out broken vantage points, because it turns out even the vantage points can be a source of errors. Now our arrow gets a little bit smaller. Then we take those pings and try to reduce them down to states: what is the state of this link right now? Is it up? Is it down? First we do that by finding downs. Then we pull out ups and this weird thing called hosed, which was the source of about a year and a half of headache, and I'll explain it in a little bit. Then finally, we put these together to find failures: when you have an up-to-down transition, there's a failure. Yeah? So how do you infer that the link is down based on an active probe? Because it could be the link that's down, or it could be a port, right?
Sure, do you mean a port at the head end, or a port at the home that you connect to? So you're asking about the probes and how they work. Yeah, so we're probing from different vantage points, so at least we're trying to isolate that the last mile link would be the source of the failure. But certainly it could be something else. For instance, if there's a power outage, then your router goes out, and so does your cable modem. Or suppose just your router went out: sure, that could actually be a source of noise in our data, if just the thing that's connected to your last mile link failed. But we're probing so many links, the hope is that that sort of noise ends up being lost and not a source of signal. So maybe I'll ask the same question in a different way. Is the only thing you're pinging the actual IP address of the CPE router? Or do you have access to the ICMP stacks of the intermediate nodes, where they're IP? Yeah, I'd love that. You're only doing CPE? I'm only doing CPE. And you infer from there? I infer from there. Again, I use multiple vantage points to try to minimize the effect of some other link on the internet being broken. Again, I want to do this across many vantage points, many locations, many weather types, and many link types. In order to do that, I need an approach that doesn't need proprietary data. Although I would love to have access to Comcast's internal stats; actually, for validation, that would be very cool. But I'd have to talk to them about how I could maybe make some of that data public. It's hard for me to do studies where I can't make the data public, so I'd have to figure out a way to work that out with them. But anyway, all right. So then finally, power outages could, of course, be a source of failures, and we want to look for network outages.
So I try to remove them, and I'll give you a short explanation of how we do that. And then finally, we correlate with weather. Took all of that. What a pain. Yes? Doesn't weather often cause power outages? Yeah, it does, and at multiple scales. One, it can cause a power outage at either end of the link. Two, it can hit anywhere from your endpoint all the way up. Right, so we do filter power outages, but we're trying to filter the ones that are reasonably large-scale, like a neighborhood or an entire metropolitan area. A very local outage, where just one person's home power is affected by some weather, could again be a source of noise in the data, but I think that would generally be much lower probability than other effects of weather. All right, so let's start with how we do the probing. So NOAA, the National Weather Service, has this really cool feed of all the weather alerts in the United States. It's a nice little RSS feed. This is an example of one of the alerts. The fields that we care about are: when is this alert effective? When does it expire? And what area does this alert cover? What's really nice about this is these alerts come ahead of the actual weather itself. So we have a nice little forecast, and we can direct our pings into that area based on these forecasts. So the other day, I looked at my iPhone and it said that the weather was going to be completely clear all day, and yet it rained all day. Do you actually validate that the weather you predicted actually occurred? Big time, yeah. We use this feed to aim our pings, and we use a completely different data set for what the on-the-ground weather was afterwards. Because sure enough, the forecast, or the alert, potentially has nothing to do with what happens in real time. I'll explain that in a moment.
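To make the alert handling concrete, here is a minimal sketch of pulling out the three fields the prober cares about: when the alert takes effect, when it expires, and what area it covers. The XML layout and element names below are simplified assumptions for illustration; NOAA's real CAP/ATOM feed uses namespaced elements and richer area descriptions.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# A hypothetical, simplified alert entry in the spirit of NOAA's feed.
SAMPLE_ALERT = """
<entry>
  <title>Winter Storm Warning</title>
  <effective>2012-12-09T15:00:00-06:00</effective>
  <expires>2012-12-10T06:00:00-06:00</expires>
  <areaDesc>Hennepin; Ramsey</areaDesc>
</entry>
"""

def parse_alert(xml_text):
    """Extract when the alert is effective, when it expires,
    and which areas it covers."""
    root = ET.fromstring(xml_text)
    return {
        "effective": datetime.fromisoformat(root.findtext("effective")),
        "expires": datetime.fromisoformat(root.findtext("expires")),
        "areas": [a.strip() for a in root.findtext("areaDesc").split(";")],
    }

alert = parse_alert(SAMPLE_ALERT)
print(alert["areas"])  # the areas to aim pings at while the alert is live
```

The parsed areas would then be matched against geolocated IPs to pick which hosts to start pinging before the weather arrives.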
So now another question is, how do we find all these residential links to ping? Well, we did a nice targeted reverse DNS scan of the internet, where we start by scanning all the /24s and try to find at least one IP that has a domain name of a known residential provider. Then we do some filtering, where we have a reasonably sized set of common host names that would be used for residential links: things like 'pool', as in part of the DHCP pool, or something like 'dsl', the actual link type, in the name. And we filter out all the routers and other non-residential things among the IPs that belong to residential providers. We found 100 million IPs doing that. So the question is, how do we then figure out where those IPs are located? And this is something that I'd like to improve, but for right now, it's kind of tough. I need to know ahead of time where the IPs are. Hundreds of millions of IPs; how do I do that? I have to use something like a database of IP geolocation. I'd ideally like to use some IP geolocation technique where I can take each of those IPs and try to figure out where it's located. But again, when the alert comes in, I need the set of IPs that I'm going to probe right then and there. So I use the MaxMind geolocation database. It could be better, but weather is sort of a regional phenomenon, so even if there is some error in MaxMind, it still should give us a reasonable estimate of where these IPs are. For example, my Comcast IP shifts sometimes; it tells me I'm in Chicago, et cetera. And I'm like, that's weird. So there are some problems with MaxMind. And actually, one of them is that for a particular IP, when you do the reverse DNS lookup, you sometimes get two different reverse names for the same IP, and I don't believe that. For Comcast, for instance, it will say California for an IP and Maryland for the same IP.
So we do try to make sure that when there are some of these duplicate names, which may have messed up MaxMind when they did their reverse DNS probes to guess where things are located, we correct for that. But yeah, you're right, it is possibly a source of error; MaxMind is not perfect. But we believe it is somewhat reasonable to do this. So we do sampling. Whenever a weather alert comes in, we only probe 100 IPs from each link type and provider in that area. So again, Verizon can have FiOS and DSL; we make sure we're probing 100 IPs from each. And finally, the way we find the provider and link type is simply that we strip the numbers out of the reverse DNS name, and what you end up with is usually a unique provider and link type name. Like, here we have DSL and Verizon. That tends to work pretty well, and we've done some checking in our data to make sure that it does. So all right, the final thing we do in the probing is the pinging itself. How do we do that? Well, again, I believe one vantage point is not enough. We're looking at the last mile link, and in order to make sure, or hope we're making sure, that the last mile link is actually what's breaking, we send from 10 vantage points. And we use PlanetLab, just because those nodes were available and widely deployed. We also ping infrequently. I've done a lot of internet measurement studies, and I've gotten a lot of complaint reports. What's really cool about the pinging-infrequently part of this is that over the three years I've been pinging, I haven't had a single complaint report. Which is a really big deal for me, at least, because those can get pretty hairy sometimes, and people can start blaming you for causing all sorts of problems. I mean, it can get really ugly. But this has been running for three years, and we're not seeing any of that. No problems. And we stole that number, 11 minutes, from John Heidemann.
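Stepping back to the naming trick above, stripping the digits out of a reverse DNS name to recover the provider and link type, a rough sketch might look like the following. The sample names and the keyword list are hypothetical; real provider naming conventions vary widely.

```python
import re

# Hypothetical reverse DNS names of the sort the scan turns up.
NAMES = [
    "pool-71-178-12-34.washdc.east.verizon.net",
    "c-24-61-0-99.hsd1.ma.comcast.net",
    "adsl-75-18-118-70.dsl.sndg02.sbcglobal.net",
]

def provider_and_link_type(rdns_name):
    """Strip the digits (the per-host parts) out of a reverse DNS name.
    What's left usually identifies the provider and, via keywords like
    'dsl' or 'fios', the link type."""
    stripped = re.sub(r"[0-9]+", "", rdns_name).lower()
    provider = rdns_name.split(".")[-2]  # e.g. 'verizon'
    link_keywords = ("dsl", "fios", "cable", "hsd", "pool", "wireless")
    link = next((k for k in link_keywords if k in stripped), "unknown")
    return provider, link

for name in NAMES:
    print(provider_and_link_type(name))
```

In practice the stripped string itself can serve as the (provider, link type) key, so that each distinct naming pattern within a provider is sampled separately.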
Heidemann had used that interval for another study, and he didn't get complaint reports either. Also, we want to omit needless pings. So basically, if a weather alert comes in and an IP is not up right then, we drop it. We try to focus only on those IPs that are alive when the weather is coming in. And finally, we know that pings, of course, can be lost for different reasons. So in our probing we actually do some extra work: when a ping is sent and comes back, and then immediately after that we send another ping and it doesn't come back, we retry the ping with exponential backoff from each of our vantage points. And that just gives us more data to see whether that lost ping actually indicates a real failure or not. So the data set we have: four billion pings, over 3.5 million IPs, and 400 days of data. This, again, is not three full years, because we've had to take the system offline to fix some bugs here and there, but that is the data set. And I'm the kind of person who believes it's tough to fully understand a data set without looking at it, so I like to make giant visualizations of my data sets. Again, we're only probing the United States for this study. Here, each one of the blue dots, and I'll start animating this in a second, is one of the hosts, and they're transparent, so if there are a lot of hosts in one area, it'll show up brighter; if there are fewer, it'll show up dimmer. And I'm going to show you the month of December. So, as you can see, here's ThunderPing, that's the name of the prober. Adaptively, as it sees different weather conditions from the NOAA feed, it's going in, picking different areas, and pinging them.
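The retry scheme described above, where a lost ping triggers retries with exponential backoff rather than a burst of traffic, can be sketched like this. The function name, retry count, and base delay are illustrative assumptions, not the actual ThunderPing parameters; the ping function is injected so the sketch can run without sending real ICMP traffic.

```python
import time

def probe_with_backoff(ip, do_ping, retries=4, base_delay=1.0, sleep=time.sleep):
    """After a lost ping, retry with exponentially growing gaps.
    A run of losses across all retries is much stronger evidence of a
    real failure than a single dropped packet. `do_ping` should return
    True when an ICMP echo reply comes back."""
    for attempt in range(retries):
        if do_ping(ip):
            return True  # got a reply: the host is reachable
        if attempt < retries - 1:
            sleep(base_delay * (2 ** attempt))  # wait 1s, 2s, 4s, ...
    return False  # every retry lost: likely a real failure

# Example with a fake ping that fails twice, then succeeds
# (192.0.2.1 is a documentation address):
replies = iter([False, False, True])
print(probe_with_backoff("192.0.2.1", do_ping=lambda ip: next(replies)))
```

Spacing the retries out is also what keeps the overall probing rate low enough, per the talk, to avoid abuse complaints.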
And by the way, when you see the red show up, that is indicative of just a basic metric: more than half of the vantage points reported that the host isn't alive. And there's some really crazy stuff that happens. I don't know if it already happened, but up in Minnesota you might see it in a moment: did it already have the big explosion of lots of downs? Yeah, so there was a really bad snowstorm that day, and that's an example of a power outage in this data. Usually when you see the giant flashes, those are the power outages in the data. It was cold in California around that time? Yeah, yeah, sure, sure. So, December was actually a particularly interesting time to study. At the bottom you're seeing the percentage of hosts in each state that were up, and then this is the number of hosts we're pinging. So we're pinging thousands of hosts in each state at any time, on any day, really. So, okay, that's the basic idea. Cool, so I'm going to start out by just giving a short overview of how we filter these things, and I'm not going to focus too much on down, because down normally looks kind of like this. You have all your 10 vantage points, that's what's on the y-axis; this is a Comcast host that I was pinging, and on the x-axis is time. Sure, you see that a few pings are dropped, but generally when there's a down, as there should be, all the vantage points stop seeing pings come back. And by the way, you can see our little retransmissions that go on there from each vantage point; again, we do exponential backoff to make sure we're not probing too much. And then we see it comes back alive at some point. So this is a down; it's pretty easy to find, and I can explain later on if you'd like to see more. Is this an actual graph? You claim the link comes back after seven hours or something? Yeah, that's right, this is from our dataset, straight from the dataset. Okay. So the outage was seven or eight hours?
Yeah, I believe this might have been during one of the tornadoes in 2011, so I think it was a prolonged outage from that. It just makes for pretty data. Were you just wondering why it's a seven-hour outage? No, I'm wondering why everything comes back at almost the same time. Well, again, this is one link that I'm probing from 10 vantage points, so that's why they all come up at the same time. So the red dot in the previous slide was an outage? Yeah, a red dot indicates that a majority of the vantage points reported that the host is not alive at that point. Okay, and do you have a measure for quality degradation rather than just outage? Yeah, we're getting to that in a second; that's what we're interested in. So this, down, is the easy thing to deal with. I can find these down periods somewhat easily. Where it gets complicated is this other behavior we observe, which we call hosed. And that's sort of what you're asking about: a quality degradation of some sort. What does hosed look like? It kind of looks like this. It's really strange, and I've seen it a lot, and it's not intuitive to explain. So the host is up, again we're pinging one host from 10 vantage points here, and things generally look fine. Then all of a sudden there's a bunch of losses, and you go into this weird regime where there's some loss probability for each ping. And then all of a sudden it kind of comes back up again and all the pings are coming back. It doesn't look like congestion, because you don't see a change in RTT. And here's an example: the color shown on the y-axis is the RTT, and you see that the RTTs don't change. Though, if the buffers are small, the RTT won't change much anyway.
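The majority rule just mentioned, marking a host as down when most vantage points miss it, is simple to state in code. This is a minimal sketch, assuming each round of probing yields a map from vantage point to whether that vantage point got a reply.

```python
def host_is_down(vantage_reports):
    """A host is marked down (the red dots in the visualization) when
    more than half of the vantage points fail to get a reply.
    `vantage_reports` maps vantage point name -> True if a reply came back."""
    lost = sum(1 for got_reply in vantage_reports.values() if not got_reply)
    return lost > len(vantage_reports) / 2

# Only 3 of 10 vantage points see replies: majority lost, so down.
reports = {f"vp{i}": (i < 3) for i in range(10)}
print(host_is_down(reports))
```

Requiring a majority across independent vantage points is what protects the down label from a single vantage point's own path problems.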
So that is one possible source of the problem. But as we'll find out later on, this actually correlates quite nicely with other factors, and different link types correlate in different ways. I've even done controlled studies at home where I hooked up attenuators to my cable to try to figure out what could be a source of this. And it turns out there is a reasonable explanation that I'll get to at the end. Yeah? For some links, like for DSL, there will be retraining. Like a retransmission, you're saying? The modem will lose synchronization. Yes. And the retrain takes, let's say, 30 seconds. Yes. And then it will come back up. Right. So if that were the case, that could be a source of this, for sure. If you have degraded signal quality and your modem is having to reconnect all the time, yeah, that could be an example of what would cause this. And that's fine. To me, if there's a degradation of signal quality on the link because of weather, and you end up with this, that's totally reasonable, and it seems like it could happen. We'll talk about that; it sort of answers this question too, and I'll get to why I think this happens and which of our assumptions about last mile connectivity we might need to change, and are already changing. It could also be the link layer. These broadband aggregation boxes that do the PPP sessions sometimes have tens of thousands of endpoints on one big Redback-style or Cisco-style box, and if they all get hammered at the same time, they all hammer RADIUS to re-establish the link layer. I don't know about an hour, though. They'd do this on the scale of an hour, with probes that are coming every few minutes? It might be the case, but I think these are not short-term events; they're really long-term failures.
It's just that the PHY and link layers have their own forms of exponential back-off after failures, and so does the PPP layer, and all of these are unseen by IP. Right, they're underneath. I don't know about an hour, though, you're right; I wouldn't expect that duration. Yeah, sure. Okay, so the tough thing was actually trying to find these hosed states in our dataset and isolate them. And it kind of makes sense why it's tough, because there isn't one particular loss rate that occurs. In fact, within one period of this hosed phenomenon, the loss rate can change by quite a bit. What you're seeing here, again, is another host from our dataset. We have the vantage points on the y-axis and time on the x-axis. It's a bit tough to see these x's, I apologize, but for each vantage point at different times there are these losses, and then eventually a nice little recovery at the end. And what I'm plotting up at the top here is the smoothed loss rate that this host is experiencing. We see again that there are all these different bumps in it. So how do we detect this? Well, honestly, it took over two years to come up with the answer to this question. There has been some previous work on this, but it never involved ping data. Previous work has this concept of transition periods of network activity, where basically you're getting, let's say, the same throughput, then all of a sudden you're getting a different throughput, and in the end you're getting the original throughput again. And when you have a single-number metric like throughput, you can find these transition periods using the techniques they came up with. But we don't really have a single-number metric here. We're looking at all these different vantage points, and we have to figure out what the loss rate is somehow.
And even coming up with that loss rate is unclear: do you compute it over a big window, a little window, et cetera? So what I figured out is that this is actually really similar to edge detection in computer vision. Essentially, if you think of these pings as pixels, then when you get to the point where there are some losses, it's kind of like finding the edges in a noisy image, okay? And so we can apply the most famous edge detection algorithm, which I believe is Canny edge detection; I never know how to pronounce it, but I believe that's what it's called. What you do is pretty basic. You apply Gaussian smoothing to the ping data, where each ping is a zero if it came back and a one if it was lost. And that's what you get in the top white line: the smoothed ping data. Then you take the first derivative of it and you look for the peaks, the maxima, in the first derivative. Pretty straightforward. What you see is really nice: right here is exactly the point where that first ping was lost, and we have this huge inflection point there. And then also at the end, we have a nice inflection point in the derivative, the next maximum there, which indicates the end. In the middle, there are different changes in the loss rate, and we also identify those. But if we see an up interval and then a hosed interval and then hosed and hosed and hosed, we just group those together as one hosed interval. All right, so that was hosed; hosed is a huge pain. Go ahead, yeah. How much lag from where the hosed interval ends do you need in order to detect it? Because within that time period, you wouldn't know whether it had gone into an interval and come out again. But that's in your data set; are you asking about real time? Real time, yeah, that's the question.
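The smooth-differentiate-and-peak-find procedure just described can be sketched as a one-dimensional, Canny-style edge detector over the binary loss series (1 = ping lost, 0 = reply). The talk mentions a sigma of 6 for the narrow smoothing; the peak threshold used here is just an assumed value, and the real pipeline also combines a much wider smoothing pass.

```python
import math

def gaussian_kernel(sigma):
    radius = int(3 * sigma)
    k = [math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k]

def smooth(series, sigma):
    """Gaussian smoothing, renormalized at the edges of the series."""
    kern = gaussian_kernel(sigma)
    r = len(kern) // 2
    out = []
    for i in range(len(series)):
        acc = weight = 0.0
        for j, kv in enumerate(kern):
            idx = i + j - r
            if 0 <= idx < len(series):
                acc += kv * series[idx]
                weight += kv
        out.append(acc / weight)
    return out

def loss_edges(losses, sigma=6, threshold=0.02):
    """Find edges in a 0/1 ping-loss series: smooth, take the first
    difference, and report local maxima of the derivative's magnitude.
    A rising edge marks the start of an interval of elevated loss
    (a hosed interval); a falling edge marks its end."""
    sm = smooth(losses, sigma)
    d = [b - a for a, b in zip(sm, sm[1:])]
    edges = []
    for i in range(1, len(d) - 1):
        if abs(d[i]) > threshold and abs(d[i]) >= abs(d[i - 1]) and abs(d[i]) >= abs(d[i + 1]):
            edges.append((i, "rise" if d[i] > 0 else "fall"))
    return edges

# 40 clean pings, then 40 pings with roughly 75% loss, then 40 clean pings.
losses = [0] * 40 + [1, 1, 0, 1, 1, 1, 0, 1] * 5 + [0] * 40
edges = loss_edges(losses)
print(edges)
```

On this synthetic series the detector places a rising edge near sample 40 and a falling edge near sample 80, bracketing the lossy interval, while the loss-rate wiggles inside the interval are smoothed away.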
If I wanted to do it in real time, I would probably have to use either this technique with a much narrower smoothing, to make sure it's finding those really fine intervals where something may have happened, or a completely different technique. This is an offline technique, and you're right. What I would probably do, and one of the plans for the study is actually to have the data in some real-time interface, is post it online with a lag of, let's say, an hour or so. I would just add some lag, and I think it's still useful even with that lag. But yeah, real time would not be so clearly doable with this technique. How do you set the threshold? Yeah, so the threshold is determined just by collecting a lot of data and seeing what makes a reasonable threshold for finding the hosed intervals. There isn't really a methodical technique for doing it; it's totally empirical. Because that would also determine the frequency of it. Yeah, yeah, that's true. So, I don't talk about it here, but there's also something we apply at the same time. It's only a little number here, but sigma equals six: that is the Gaussian smoothing that we're applying. We also apply a much wider smoothing at the same time. And this is actually drawn from previous work; they did something similar, where basically you try to find both the quick changes and the long-timescale changes. When you combine those two, you're able to filter out some of the short-timescale changes that don't actually turn into longer-timescale changes. What kind of benefit do you get from that? Like, how does it help to have the long period of time, if you didn't find those peaks?
Oh, it's important to us because, essentially, we believe these hosed intervals could be caused by the weather, as a result of a degradation in signal quality on the link. And honestly, we didn't expect this. We thought that it was going to be kind of general internet loss that would be going on, and that if there was an interval like this, it would be associated with high RTTs. And that actually wasn't the case. Yeah. Did you correlate this with the actual weather, what actually happened? Yeah, I did; I'm going to get to that at the end. And then the follow-up question is: do you know what kind of weather causes it? We split it up by weather type, yeah. Okay, cool. And that's what I'm going to get to right now. So, just to give you a quick idea of what we do to remove outages: it turns out the government maintains this list of power outages that are self-reported by the power companies. And what we did is correlate that list with our up-to-down failures. We have this hypothesis, and this is our assumption: essentially, if you have more than one IP, from more than one provider, that failed at approximately the same time, that is likely to be a power outage, whereas if you only have one IP that's failing, that's likely not to be a power outage. And we correlated with this power outage data, and we observed that when you have IPs from two different providers that failed at approximately the same time, that's very likely to be during one of these known power outages. So that's our threshold.
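A toy version of that two-provider rule might look like the following. The grouping by fixed time windows and the exact tuple layout are simplifying assumptions for illustration; among other things, fixed windows can miss coincident failures that straddle a window boundary.

```python
from collections import defaultdict

def filter_power_outages(failures, window=600):
    """failures: list of (timestamp, location, provider, ip) tuples for
    up-to-down transitions. Per the rule in the talk: if, within the same
    location and time window, IPs from two or more *different* providers
    fail, treat the whole group as a suspected power outage and drop it."""
    buckets = defaultdict(list)
    for t, loc, provider, ip in failures:
        buckets[(loc, int(t // window))].append((t, loc, provider, ip))
    kept = []
    for group in buckets.values():
        providers = {f[2] for f in group}
        if len(providers) < 2:  # single-provider failures survive
            kept.extend(group)
    return sorted(kept)

failures = [
    (100, "denver", "comcast", "1.2.3.4"),
    (130, "denver", "centurylink", "5.6.7.8"),  # second provider: outage
    (200, "boise", "comcast", "9.9.9.9"),
]
print(filter_power_outages(failures))
```

Here both Denver failures are discarded because two providers failed together, while the lone Boise failure is kept as a candidate weather-related network failure.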
If the down interval comes at a point when there were two IPs from two different providers in that location that failed, then we knock it out of our data set. And it's removing a lot of data from the data set. Oh, that's interesting, because I would have almost thought that doing the opposite would be a good idea. Oh really? The idea is, if just Comcast failed, maybe Comcast is messed up, right? But if Comcast and some other ISP, or multiple ISPs, failed, then it's less likely to be a problem with just the one ISP due to configuration or overloading their aggregation point. You know, basically the problems I always get when I call Comcast, where it turns out they've just forgotten to flip some switch. Sure, sure. So again, I think there's a big diversity in how these links are deployed. Even in one area, you have some links that are underground, fiber links, and you have some links that are on the telephone poles, with different equipment. And to me, it doesn't seem likely that you would have these correlated failures unless there's a correlated source of the failure, in terms of some property that they both share, some resource they both share. And to me, that's what power is. What about multiple wireless ISPs, for example? They're all sharing the air. Yeah, but if you have a place where you actually have an open cable or DSL market with multiple providers, same sort of thing: you get providers that are sharing the same conduit, which is also common. Yeah, sharing the same conduit I totally agree with you about. The wireless thing I actually don't agree with, it turns out; I've worked with these wireless internet service providers a lot.
They tend to cut the territory up among themselves because they're usually using unlicensed spectrum. So it's unlikely that one location will be served by more than one. It's not that it doesn't happen, but it's uncommon. I mean, on my iPhone, I've seen at least five... Oh, I'm not probing cellular links. Oh, okay, I was going to say. If you look at your Wi-Fi, you can see AT&T, I see Comcast, I see Starbucks... That's true, but I'm probing the residential links. So I'm mostly probing wired links as well as fixed point-to-point wireless ISPs. So I'd be probing the router that serves that AT&T modem, not your iPhone. That actually makes sense. But yeah, it is the case that there could be conduit they're sharing where you have a correlated failure. That definitely is true. There is shared infrastructure in certain places where they have expanded. In the last mile, yeah, for instance, you have CLECs, where a telephone company has a lot of DSL companies providing service from the same infrastructure. Yeah, that's an example of something we'd have to handle. And I'm assuming somehow you also figured out planned network disconnects, right? Disconnecting because the provider has some issue, or somebody disconnecting because they want to move to another provider? So we try to treat that as noise, because it's very difficult for us to control for something like that. The assumption is that it would happen across link types and across providers, and so it should affect all of our data evenly, across the weather questions as well. Can we defer questions? Yeah. So, all right. That's weird. Did you see the screen change? Did it flicker really quick?
I don't know, maybe. Someone look on Twitter for the earthquake bot. All right, okay. Now, to answer your question from the beginning: we use forecast data from NOAA to figure out when to probe, but for actually figuring out what the weather was at the time we were probing, we use airport weather stations. This is a picture of an airport weather station, called an ASOS station. Honestly, at the beginning of this, I didn't quite understand where our weather measurements truly came from, but these are certainly one of the primary sources for things like what type of precipitation is falling, how much fog coverage there is, et cetera. This thing is my favorite measurement device, the precipitation identifier. It has a light shining into a sensor, like a camera, and it can figure out what's falling between the two sensors. That's how it figures out if it's snow, if it's rain, et cetera. Very cool, I think that's a really cool contraption. The visibility sensor is also, I mean, it's not important or anything, but I also think it's a cool little toy. It has a laser, and it checks the reflection off of whatever's in the air. Pretty cool. Anyway, so we have all these real, physical measurement devices, and they spit out these reports. If anyone in here is a pilot, they may be familiar with METARs. These are the hourly weather reports that come from these weather stations. Here's an example from, I think, Flagstaff, Arizona. Over time, across these weather reports, we see it went from clear to scattered clouds to haze to thunderstorms, and eventually a heavy thunderstorm. These automated weather reporting stations are the source of our ground truth data. And if you want a look at where these are generally deployed, this is a view of the East Coast.
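To give a flavor of what those hourly reports look like, here is a minimal sketch of pulling the present-weather group out of a METAR string. The code table is a small subset of the real METAR codes, and the example report is made up, not from the study's actual data.

```python
# Small subset of METAR present-weather codes (illustrative only)
WEATHER_CODES = {
    "RA": "rain", "SN": "snow", "TS": "thunderstorm",
    "FZRA": "freezing rain", "BR": "mist", "FG": "fog",
}

def present_weather(metar):
    """Return the weather conditions named in a METAR report string."""
    conditions = []
    for token in metar.split():
        code = token.lstrip("+-")  # strip intensity prefixes (+ heavy, - light)
        if code in WEATHER_CODES:
            conditions.append(WEATHER_CODES[code])
        elif code.startswith("TS") and code[2:] in WEATHER_CODES:
            # Combined groups like TSRA: thunderstorm with rain
            conditions.append("thunderstorm")
            conditions.append(WEATHER_CODES[code[2:]])
    return conditions

# A made-up Flagstaff-style report with a heavy thunderstorm:
print(present_weather("KFLG 212256Z 24012G25KT 5SM +TSRA BKN040CB 18/12 A3021"))
```

A real decoder would handle many more groups (vicinity, descriptors like showers, multiple weather groups), but this is the basic shape of turning station reports into the categorical weather labels used later in the talk.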
This is when Hurricane Irene, I think it was, came up the East Coast, and I plotted a general picture of what the weather was. You can see that as it was coming up into the Maryland and New York area, we had a lot of rain in those areas. And each one of these diamonds is one of the airports we're getting our data from. So this is the reality of weather ground truth data. There are other data sets, like Weather Underground, but I wanted to be really careful to only use this curated sensor pool. These airport weather stations are maintained by NOAA, and they really want to make sure they're operating properly. Some of these airports even have weather operators who maintain the station and make sure it's working at all times, because if it breaks, planes have trouble landing. So it's good data, that's for sure. All right, so finally, let's look at the actual weather results. Okay, so let's look at wind, first of all. This plot, and I apologize, it's a little bit difficult to see, but on the left-hand side we're looking at failures where we go from up to down. On the right-hand side, we're looking at failures where we go from up to hosed. The different link types are shown in different colors. What we're looking at, by the way, is wind speed versus probability of failure within an hour: if you're looking at one IP for an hour, what's the probability that it will fail in that hour? And there are a few really cool things here. First of all, if you look at up-to-down failures, the relationship between wind speed and the failure rate is this nice nonlinear relationship. And that actually makes sense. Of course, with any data you can come up with some explanation, right? But here's my explanation.
Drag: the relationship between wind speed and drag is quadratic. So it's not too surprising to imagine that if you have, say, telecommunication lines being dragged by wind, the failure rate would have a quadratic relationship with wind speed. So that's pretty cool. And that holds across the wired link types, DSL and cable. By the way, I can explain how we got the link types. It's kind of a sad story. Part of it involved looking at over 2,000 websites for different ISPs, seeing which link types each one provided, and only using the ones that provide a single link type, to make sure we're only looking at that particular link type. That was a huge pain. I'm really tired of seeing stock photos of people lying on their bed with a laptop; they have all this default imagery on these websites, really annoying. But anyway, with hosed, interestingly enough, there isn't as clear a nonlinear relationship as there is with down. And with satellite links, in both scenarios... bless you... this is one of the large satellite ISPs in the US, called WildBlue; this is one of the main ones we were probing. And there's always interesting stuff happening in satellite, especially when we get to temperature. This is pretty fun. So here's an example. I'll go with satellite first, because honestly we already know that satellite is affected by lots of different weather conditions. First of all, because it's really high frequency; I think WildBlue operates around 20 gigahertz, so high frequency that rain fade and fog and clouds and all these other weather conditions obviously have an effect. But the temperature one was really weird. I looked at this and I was like, what the hell is happening? Yeah, same.
Yeah, and the same for down and hosed. So here's the kind of crazy thing. This is, again, my explanation for the data, and of course you can come up with a million. I think this is likely to be sun outages. One of the things about satellite internet connectivity, or any satellite connectivity, is that at some points during the year, the sun will be roughly in line with the boresight of the satellite. And of course, the sun is a giant electromagnetic emitter. This is a well-known phenomenon: satellite connectivity will have these known outages at known points in the year, depending on your location, what size dish you're using, and where the satellite sits on the horizon. So my belief is, for WildBlue, given where their satellites are deployed, generally in the United States the early fall and spring are when they have their sun outages. And I believe the 70 degrees you're seeing here essentially corresponds to early fall and spring, whereas in the middle of the summer you're unlikely to see a sun outage. That's why the probability drops off in the middle of the summer, and the same for the winter. So presumably this depends on the time of day, right? Or is it the same spots? This is all the time. All the time? Yeah, all the time, that's correct. Now, another cool thing to note here: let's look at the WISPs, the wireless internet service providers. For up-to-down failures, as the temperature increases, the failure rate doesn't really increase. But for the hosed failures for these wireless ISPs, and again, these are ISPs with fixed point-to-point wireless links between a house and a central access point, used in rural areas and now even in many metropolitan areas, we see this nice relationship with that weird hosed thing I showed: as temperature increases, we get more and more of this hosed behavior.
It could be because these links are deployed outside and the equipment itself doesn't have air conditioning and can't handle the heat; I'm not exactly sure. But it certainly seems like there's a relationship there. The other cool thing is that the extremes of temperature are where we see this nonlinear behavior for the other link types: the extremely cold and the extremely hot are where you see some effect, except for fiber. For up-to-down, fiber is kind of flat across the board; it doesn't seem that affected by weather. Yeah, okay, how much time do we have? Two more minutes? Okay, I only have one more slide left, so. All right, so now, instead of continuous variables like amount of rain or amount of heat, we're looking at different types of weather. I call this the survivor experiment. Basically, the question is: in a particular hour, for a particular host, when it was experiencing, let's say, a thunderstorm, what's the probability that it failed? Okay. The error bars you see here: I compute these probabilities per day in our data set, and the error bars are 95% confidence intervals over the set of days we look at. So: how much does that per-day probability of failing in each weather condition vary? And by the way, on the top you see how many actual days we observed that weather for the different hosts. For something like tornadoes, we barely see them; there are only a few days where we see tornadoes, and that data is kind of funny. That's why those error bars are gigantic. So let's start with the first thing: fiber and cable generally have the lowest failure rate in clear weather. That's just a simple, basic observation that I was a little bit surprised by.
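The per-day probability with a 95% confidence interval just described could be sketched like this. It is a minimal illustration under an assumed normal approximation over the daily fractions; the talk's exact statistical method may differ, and the input format is hypothetical.

```python
import math

def daily_prob_with_ci(daily_counts):
    """Mean per-day failure probability with a 95% CI over days.

    daily_counts: list of (hours_observed, hours_with_failure) per day,
    restricted to IP-hours in one weather condition (e.g. thunderstorm).
    Uses a normal approximation over the daily fractions (assumption).
    """
    probs = [f / n for n, f in daily_counts if n > 0]
    mean = sum(probs) / len(probs)
    # Sample variance over days, then a 1.96-sigma interval on the mean
    var = sum((p - mean) ** 2 for p in probs) / (len(probs) - 1)
    half_width = 1.96 * math.sqrt(var / len(probs))
    return mean, (mean - half_width, mean + half_width)
```

Sparse conditions like tornadoes would contribute only a handful of daily fractions here, which is exactly why their intervals come out gigantic.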
It kind of makes sense, in that the infrastructure is newer and it is a wired infrastructure. But that observation has nothing to do with weather; it's just a baseline. Next, satellite, as we expected, has the highest difference between its failure rate in clear weather and its failure rate in rain. Again, that's expected because of rain fade. Next, we see that freezing rain, which is the dark purple line, I believe, yeah, that's right, actually has a pretty strong effect. It has a surprising effect for cable, where the error bars are nicely separated, as well as DSL. But even for the other link types, it appears that freezing rain mattered more than I expected. Which again makes sense for up-to-down failures, because freezing rain is likely to ice up wires, and then wires can snap. So that's not too surprising. And generally, thunderstorms are bad for every link type. Even though we try to filter out the power outages using those really conservative measures, is it still likely there are some power outages in there? For sure. But by the way, before we did this filtering, I think we saw generally a four times higher probability of failure in a thunderstorm compared to clear, and once we did the filtering it went down, for some of them, to two times or less. Why is there no tornado data for fiber or WISPs? Which one? Fiber and WISPs have no tornado data? There's no tornado data. Yeah, the tornado data is very sparse. There are only a few events; 2011 was really bad, and that, I think, is where you're seeing a lot of the satellite and dial-up customers. That was in Oklahoma, I think. So finally, this data set is also kind of interesting for observing other phenomena.
Like this: this is when Hurricane Sandy came over the East Coast. Here you can see, and again, this is most likely power failures; this is the raw data itself, but as it comes in over New Jersey, the whole New York metropolitan area goes out around the same time. And the thing is, we're one of the few folks observing these events, so we're the ones that have the data and can answer questions about it. By the way, it was a little tough to see as it was crossing, but way off in Pennsylvania, they were losing connectivity quite early on, even before the hurricane hit the shore. And it looks like there might have been an ISP in the Washington, D.C. area that turned itself off somehow before the storm came. I'm not sure what the benefit of that would have been, but if I replay it, you'll see that it turns off well before the other ones do. Yeah. And the last slide is the conclusion, so I can do that. Ah, we're a minute over, so I'll just conclude. Okay, so generally, what we did here was observe weather and try to figure out how weather relates to failures of residential links. In the end, that required us to build a tool to figure out when failures occur in residential links, for a broad set of residential links. And I think that's applicable to more than weather; I'd definitely be interested in hearing your thoughts on what else you think you could apply this data to. Also, we found these interesting relationships between different weather conditions, some nonlinear behavior that I didn't expect, as well as this interesting linear behavior for rain when we go from up to down, but not for hosed. And I believe, though it's not something I showed in the data, that as the precipitation in inches increases, you have a linear relationship for up-to-down but not for hosed. And finally, the link type definitely matters, even for the wired links.
And this, again, intuitively seems like it could make sense. The DSL infrastructure is on older telephone lines; the cable infrastructure is newer. Even though they're theoretically kind of similar, the way they're deployed definitely affects the relationship between failures and weather. Finally, the hosed thing, and this is to clear up the point you guys made earlier on. Here's what I think. What I've observed is that most link types, DSL, cable, satellite, fixed wireless, assume link budgets, meaning they set a fixed modulation and coding rate, and that's what's used throughout the entire running of the network. And if the link conditions change, then you just lose packets. That's why, when I ran my test at home, I was using 256-QAM on my DOCSIS 3 cable modem, and as I increased attenuation, I was still using 256-QAM on my DOCSIS 3 cable modem. Because that's how these networks are designed: they're designed assuming a link budget that says, at worst case, the signal will probably degrade by X. And I think the observation of this hosed phenomenon makes us wonder, first of all, whether those link budgets are adequate, and second of all, whether we should add rate adaptation to these protocols. It turns out DOCSIS 3.1 is adding it, and I think a bunch of these protocols will add it within the next few years. So this issue of hosed links, I'd like to see it start going away in the long term, and my expectation is that it actually might. But again, I was super surprised. Intuitively, as a wireless guy, when I turn down the signal quality, the rate should drop. Well, it's true for wireless in general; most of your phones and base stations and access points will do that. Exactly, and that's where my intuition came from. But when I look at this link, a coaxial cable link is essentially a wireless link on a cable; it's not doing much different than that.
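The fixed-link-budget point can be made concrete with a toy comparison of a fixed profile versus a rate-adapting one under a shrinking SNR margin. The thresholds below are made-up round numbers for illustration, not real DOCSIS figures.

```python
# Hypothetical (required_snr_db, bits_per_symbol) profiles, fastest first
RATES = [
    (33, 8),   # 256-QAM-like
    (27, 6),   # 64-QAM-like
    (21, 4),   # 16-QAM-like
    (15, 2),   # QPSK-like
]

def fixed_link(snr_db, required_snr_db=33):
    """Fixed modulation: either the link budget holds or packets are lost.
    Returns bits/symbol, with 0 standing in for the 'hosed' loss state."""
    return 8 if snr_db >= required_snr_db else 0

def adaptive_link(snr_db):
    """Rate adaptation: pick the fastest profile the current SNR supports."""
    for required, bits in RATES:
        if snr_db >= required:
            return bits
    return 0  # link truly down

# As water in the cable eats the margin, the fixed link goes straight to
# loss while the adaptive link degrades gracefully:
for snr in (36, 30, 24, 12):
    print(snr, fixed_link(snr), adaptive_link(snr))
```

The fixed link jumps from full rate to losing packets the moment the margin is gone, which is roughly the hosed behavior observed, while the adaptive link steps down through slower rates first.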
And sure enough, they don't change the rate. I've talked to a few people about this, and they claim it comes from the broadcast mentality of these providers, especially the cable ones, which is: when we send our TV signal to everyone's house, we don't send it at different rates, we send it at one rate. And if it doesn't work, they tell you to get a better antenna, or to put an inline amplifier in, because you're having a link budget issue. So, okay, that is the conclusion of the talk. Any questions? I wanted to end, by the way, with this quote, which I think is kind of telling. One of the questions that often comes up with this is: why not just underground all the cables in the United States, and how much would that solve? And the fact is, budget-wise, given the size of the United States and where people live, it's incredibly expensive to underground; in fact, completely prohibitively expensive. For new communities, you can do it. But for old communities, the cost is insane: something like six times the cost of the existing infrastructure, and 25 years to deploy; it goes really slowly. That's not that much money if you're, like, Apple; that's pocket change in the bank. Or, I mean, large companies like Apple or Google could put $41 billion in. But this is just one community in North Carolina; it's $41 billion for one community in North Carolina. Oh, I thought this was the whole U.S.; that seemed small, we should just do it. Yeah, you're right, it's $41 billion for one community in North Carolina, which is like... It's all rights-of-way, right? It's all rights-of-way; right-of-way is the biggest cost driver for undergrounding, yeah. Question on the data: was there some element of predictability? Can we predict outages from it or something? Yeah. So I think we can.
From these probabilities we observed, we see somewhat consistently that the probability of failure in rain for DSL is X. And by the way, if we look more carefully, per location, we might have even better estimates of what the probability is. So I think we do have the ability to predict. Now the question is, what do we do with that? To me, for content distribution networks there might be something, for peer-to-peer networks there might be something, but again, I'd be interested in your feedback about what you think you could do. We can predict the weather, so rain is going to come; what do we do to prepare for that in our last-mile links? Do we make sure that peer-to-peer services running on those links get moved to other links? Does Skype move its supernodes, or whatever they're called, off of links that are about to get rained on? I don't know, yeah. So the two ways you could possibly... So I'm really interested in your detection of this hosed mode. Sure. There are a couple of obvious approaches by which you might be able to validate this model. One would be versus ground truth: you actually have some validated data where you've measured it and said, well, actually, yeah, we went to the cables, we looked at them, and they were, in fact, waterlogged. 100%. And the other would be a mathematical model. And I guess my question for you is, what approaches did you take, or are you thinking of taking, for validating your model? So for the cable one, in terms of water getting in the cable, I actually am setting up the experiment now to inject water into cables and see what happens. Which is going to be a lot of fun, and hopefully I don't hurt myself in the process. I mean, I'll try to figure out how to isolate myself as much as possible, but...
So you're going to do experiments to try to get the ground truth and validate the model? 100%. But I'll tell you, I've already done a few experiments with attenuators, where I'm trying to simulate some kind of attenuation. Basically, when water gets in your cable, it's not actually attenuation that happens; it's an increase in the standing wave ratio, because the impedance of the cable changes. So it's not exactly attenuation. But I've tried with these variable attenuators, I've modeled it, and sure enough, you get this behavior: latency doesn't really change and loss rate increases. But the question is, is it the same with water? I'm not sure. I need to run more experiments actually injecting water. Have you approached any ISPs for this study? Yeah, so... That'd be another way of getting ground truth, possibly. Yeah, so this is... it's dangerous, right? I'd love to work with ISPs, but I know that ISPs will never let me release any aspect of their data to anyone; that's my feeling. So, I mean, you ask, what would I do with the data with an ISP backing me, right? I look at that and go, I can't magically make that part of the access network better. But what I can do is be honest with them: we're aware of weather issues in your area, we know about it, let us know if you'd like to talk to us. Because of the spike in calls you get as weather events come through. I've seen that data, and they correlate really strongly. Yeah, 100%. And one thing we can do is provide this as a public service, to say: guys, don't call your ISP, the source is the weather. If you want to solve this problem, you've got to figure out how to get better cables to your home, essentially. That would be my approach to what to do with it. But you're right. Collaborate with them. Again, I don't know about collaborating. Which ISP do you work for, can you say? You know, I worked for one in Australia.
I worked for two companies that serve the big service providers here, and I'm now working for a company that makes equipment for WISPs. Okay, well, what are the chances that any of them would release data? They're pretty guarded with it. But, I mean, let's talk afterwards; I think it's something that, from a WISP perspective, you're probably thinking about. WISPs are smaller; they're more likely to be open. AT&T would be very tough. The WISPs, for sure; I've actually already started collaborating with them to validate this data. Because they're just people that work at farms or wherever. They're like, yeah, I put an access point in my grain silo, I don't care if you see my data. But as scientists, I'm interested again in your thoughts: what if we use data that we can't have other people validate our results with? I'm just saying, as a scientist, I kind of don't trust myself to do it totally 100% right and then say, yeah, this is it. Because I know that we all make mistakes. Yeah. Go ahead. So you had this picture at the beginning with the CPEs and the different aggregation levels. And I'm wondering whether you did any study where you could see which piece of the equipment fails. Because let's say you have a small area, and you know who is in the same neighborhood and how these things are aggregated; you could see whether the failures are correlated at a specific point. Yeah. Again, a controlled study where we have more data. I haven't done it yet, but I want to figure out how to do it. I think that has an even higher barrier than just collecting the data, though. This is Christophe.
My question is whether you actually made any comparison between media that are strung on poles and media that are underground. Because you compare coaxial cable and fiber, but that's not quite the same comparison. Yeah, I mean, what we've started doing is comparing to Europe, comparing the same media types to deployments in Europe. And in Europe, generally, things are undergrounded. Preliminary results indicate that you still have similar relationships. But that's how we'd have to do that study, because it's very difficult for us to get ground truth on what's undergrounded and what's not, except to say that in Europe, generally, things are undergrounded. But it seems like wind would not affect the underground stuff. Yeah, so wind should not, you're right. I'm talking more about rain and precipitation-type behavior. It turns out that even when these things are underground, a lot of the equipment is in air-conditioned cabinets, and there are still a lot of ways for water to get in, or whatever it is; let's just call it a coaxial cable. Yeah. Because basically, what you're comparing is whether coaxial cables on poles break more readily than fiber cables, right? Not the intrinsic quality of the media. Well, that is an intrinsic quality of the media, I guess, right? That's an important point. But at the same time, we may ask ourselves, for instance, if we deploy fiber, is that the end of these problems? And at least for something like wind, it doesn't seem like that's likely to be the case. But for precipitation, for different types of weather, it actually seems like fiber has the lowest failure rates of all of them, across the board. So we may actually be improving things by deploying fiber. I'm sure that distance may play a role too, right? The distance between the CPE and the central office. For sure. But again, that's why each deployment is different.
So I wanted to look at deployments across the board to get some kind of aggregate information. So I think I may have misunderstood the way you presented some of your data. Sure. For instance, when you did the graph of wind speed versus failures, did you make any attempt to isolate that to just wind speed? I can imagine that increased wind speed would be really well correlated with rainstorms, which would also have a bad effect. Like isolating one particular effect versus another, if they're all going on at the same time. Yeah, I believe for wind speed, for all the continuous variables, we made sure that was the only effect going on at the time. Because you're right, a thunderstorm could be highly correlated with it. So there's no rain in your wind graph, basically. Yeah, that's right. So, I mean, you're right, and that is a potential source of error. And again, I could double-check my data to make sure that's correct, but as far as I remember, we were isolating those. Because for sure, it could be a mixing of variables. Even temperature: in the summer, maybe there were rainstorms or something like that. For sure. But another thing to take into account, even if that isolation hadn't been done, is that when we look at the relationship between rain and failure rate and the relationship between heat and failure rate, they look different, and intuitively, to me, that implies they're not simply proxies for each other. Something that I thought should have been related, but the data didn't show at all, was the failure behavior of DSL and dial-up. Because they share the same copper wire infrastructure; in fact, they're kind of the same. If you're too far from the central office, you can just use dial-up, but if you're close enough, you can use DSL. Or if you're cheap, you can get dial-up anyway. So here's my hypothesis on this one.
Basically, they don't have DSL where they have dial-up. You have DSL deployed in your neighborhood where a provider has that kind of coverage, and the dial-up-only providers from those websites that I looked at are all in really remote areas. So they aren't the typical dial-up. What I'm finding is that the copper is probably either older or deployed differently in those areas where people are still using dial-up, especially because I isolated only providers that only serve dial-up. And generally, aside from the bigger ones, which I actually don't include in this data, those are the smaller, local ones, like your friend that has a modem bank, and those are in really rural areas. So I think it would be different: what copper is deployed there, and how old is it? In Palo Alto, it should be the same copper. But I don't have any dial-up providers that are in Palo Alto. How many users would there even be in Palo Alto? Yeah, probably very few. I'd be very surprised; unless you're talking about machines like ATMs that ship with a dial-up connection, there's no reason for it. I wouldn't be surprised at all; NetZero still has hundreds of thousands of dial-up customers, and AOL still has customers. The expectation is also that if they have dial-up, they don't have a broadband connection, right? Sure; like the satellite folks, for some of them that's definitely all they have. So if they have dial-up, they probably don't have broadband. That's probably the case, yes. That's important too. I'm just curious: do they call and complain? Oh, they complain about it if it's broken.
Could there be a reporting bias, where maybe they don't have, you know, an alternative? That's something you could potentially account for. Let's do one more question and then we can go. Sure. Actually, I had two somewhat unrelated questions. One was more of a comment: it would be interesting to correlate the failure rates of internet with the failure rates of other things going over the same pipes, like phone or cable TV. And again, how do you get access to that? It's going to be tough to untangle now, because they are shared in so many instances; the triple-play plans are one of the ISPs' main ways of making money, so a phone number doesn't necessarily mean you're going over copper phone lines. Oh, but you're saying you want to correlate with exactly that? Yeah, yeah, for that reason. Cool, yeah, I like that idea. That's a nice idea. And the second part was related to the wind data: when you're collecting wind data, do you actually get NOAA forecasts for wind, or how do you know when to start probing for wind? So there are these weather alerts that come out for high-wind times, mostly in coastal areas; it's for boating. I'm not quite sure why they release them everywhere, but they do. One thing I was going to suggest is that they do have winds aloft forecasts, aviation forecasts, that predict winds. For high-wind forecasts, yeah, for sure. Though I'm not sure what altitudes those cover; it might be too high for ground-level stuff. Don't they go from the surface up? Yeah, it could be; I'll have to check. Cool. Thanks, everyone. Thank you guys. Thank you very much. Thank you.