Good morning. I'm very excited to be here today. I am presenting the work of an incredibly talented team spread across Google technical infrastructure and the Google Cloud Platform. I'm privileged to be part of this team, and I'm also privileged to be presenting to you on their behalf today. So I'm going to split this talk into two parts. In the first part of the talk, I'm going to snapshot and reflect on where we are with networking at Google, a little bit of a retrospective on software-defined networking. And I'm going to use that as the launching point to talk about some of the challenges that I see for networking moving forward. To me, the question of whether software-defined networking is a good idea or not is closed. Software-defined networking is how we do networking. So now, what's next? OK, first, a little bit of background on Google's network. It is certainly much more than a collection of data centers. This map depicts our presence across the internet. It's worth noting that 25% of internet traffic originates from Google today across all of our services. That's one out of four bytes delivered to enterprises, delivered to end users. We do so not only from our data centers but actually principally from about 100 points of presence spread across the world. This is where we peer with and partner with internet service providers. It also, as you see in yellow in this picture, delivers from Google Global Cache, edge nodes that are actually deployed within ISPs for the best experience for our users across the world. And of course, it consists of a large fiber network, depicted in blue, interconnecting our sites across the planet. Over the last few years, we have gone headfirst, fully, into cloud through the Google Cloud Platform. And this has expanded our network in new and exciting ways. What you see layered on top of all our deployments is actually our sites for new and existing cloud regions.
So what you see in green are the current cloud regions spread across the world, and new ones that have recently been announced and that are going to be coming up over the coming months. The cloud has been a really exciting opportunity for us in networking. We've had to build one of the largest networks in the world for more than a decade to support Google internal services, whether that's web search, Gmail, YouTube, or whatnot. But with the move to cloud, we're now hosting the world's external services as well. And this is really pushing us into a step function of functionality and capability. Our architecture is really built around disaggregation, and this pushes our requirements for networking within the data center. What I mean by disaggregation is that storage and compute are spread across the entire building. We're not thinking about which server holds a particular piece of data; we're replicating the data across the entire data center. And what that means is that the bandwidth requirements and the latency requirements for accessing anything anywhere have just jumped substantially. We're pushed within the campus as well. We have many buildings in one campus, each hosting multiple clusters. The campus is the unit of replication for our services. No matter how hard anyone tries, you can only get so much availability out of a single cluster. It's a fault zone in the end. We have highly available clusters. But if you want to get to the next level of availability for your global users, you must replicate across buildings. That's just sort of an inherent law. What that means is that you have to replicate content between buildings in real time. If an update is applied to your service in one place, it must be reflected in real time, fully durable, fully transactional, in another building. This pushes the campus bandwidth needs, again, what we're seeing as a 10x step function.
Similarly, we're pushed between our campuses, across the wide area network. The availability of turnkey video distribution and the rise of the internet of things, if you will, which really means many sensors connected to the network, cameras, real-time video streams coming in, analysis going out, are pushing our WAN bandwidth needs by 10x as well. What we realized more than a decade ago is that we couldn't buy the network that we needed to fulfill our needs, and that is even more true today. We need disruptions, really, in bandwidth, latency, availability, and predictability. I want to emphasize this last point of predictability. For humans, a predictable network is really important. But for good or bad, humans are much more tolerant of variability in network performance. If your web page takes a little bit longer to load every once in a while, you're probably going to be willing to live with it. Computers are much less forgiving. They need predictability, and they're actually going to be built for the worst case rather than the common case. So making the network much more predictable is critical. So the history of modern network architecture, I think, begins at ONS. In other words, this community has been responsible for defining what networking looks like in the modern age. And it's really different from what it has been. At Google, we've been excited to be able to tell part of that story. So let me, again, snapshot some of the things that we've been doing in, well, let's say, broadly, networking, but software-defined networking: the pillars of SDN at Google. And actually, all of these things have been described here at ONS over the years. In 2013, we presented B4, our wide area network interconnect for our data centers. In 2014, we described Andromeda. This is our network function virtualization stack, our network virtualization stack, that forms the basis of Google Cloud. And finally, in 2015, we described Jupiter, our data center network.
Let me spend just a few seconds on each one of these. I'm not going to go into any of them in detail, just because I want to get to some of the newer material. So B4 is Google's software-defined WAN. And this picture depicts our data centers spread across the world and the interconnectivity of those data centers through B4. This is built entirely on white boxes with our software controlling them. At the time, we believed it was the world's largest private network; it likely is one of the biggest, if not the biggest, today, and it continues to grow at an unprecedented rate. When we built it, actually, our goal was to build what we referred to internally as a copy network. What I mean is lots of free-ish, or at least cheap, bandwidth that might not be so reliable, so you could use it opportunistically. As it's grown, it's really become mission critical. It's transformed into something that everyone counts on for the highest levels of availability and the highest levels of bandwidth. So this graph here shows the growth of traffic across B4 over time, until about the end of 2016. And a very interesting thing for us is that B4 actually continues to grow faster and carry more traffic than our public network. And again, our public network carries about 25% of internet traffic. So this should give you a sense of the demands for computer-to-computer communication across the internet through our data centers. Andromeda is our network function virtualization stack. And basically, the way to look at this is: how do we support multi-tenancy within our internal cloud such that external customers can run on this platform? How do you give the view of one dedicated private network to all of our customers, allowing them to configure it, expand it, shrink it, replicate it across multiple buildings or even multiple regions interactively? And finally, Jupiter is our software-defined approach to data center networking.
It reflects the culmination of about 10 years of work and multiple generations of different internal solutions that we built to support our data centers. As of 2013, our Jupiter network supports more than a petabit per second of bandwidth. That's bisection bandwidth among all the servers, up to 100,000 servers within a data center. And it's continued to grow. So these are the three pillars of SDN at Google. And of course, we were very proud of the work. But one of the pieces of feedback that I always got from folks outside of Google was: this is great, but it only works for internal applications. You have your walled gardens with your data center solutions, your private WAN, your cloud solutions. But the real challenge, and I do agree with this, the real challenge is, how do you bring software-defined networking to the public internet? So we realized this challenge. And we realized that our network would only be as strong as its weakest link. In other words, if we are to deliver a highly available, predictable, high-performance experience for our customers end to end, we need to be able to extend the capability of our network not just from our data centers and between our data centers, but all the way to the public internet. So today, I'm going to be telling you just a little bit about Espresso, which is our SDN approach for the public internet. Espresso has been in production at Google for two years. It's carrying a substantial fraction of our traffic transparently. In other words, if you're using Google, there's a pretty good chance that Espresso is responsible for carrying some of that traffic today. Let me put Espresso in context for you. So I described our Jupiter data centers that are interconnected within Google by our B4 network. We have a second network called B2. This network connects our data centers to our peering metros.
Recall, I told you about the 100 points of presence, the 100 peering locations we have across the world, where we peer with our partners. This is our B2 network, our public network. And of course, we have these peering metros that contain not only routers but also a fair number of servers, a fair amount of storage. We do run some computation there. We serve cached content for YouTube and other large object distribution sites at the edge. And of course, around this is the internet, where our users interact with us through the peering metros. What we've now done is we've taken our approach to software-defined networking all the way to the peering edge with Espresso. So why? The before and after. Previous to Espresso, we did networking essentially just like everybody else did. We ran protocols on high-end routers and peered with our partners. But these routing protocols running inside individual routers had a very local view. They had to have a local view. This router is going to peer with that router, and it's going to build up its view of connectivity for the network. And internet protocols, in the end, are optimized for connectivity first. In other words, if you go back to the history and the genesis of internet protocols, the goal of routing protocols was to find a path between source and destination. Once you find a path, you're done. Success. The goal of internet routing protocols is not to find the best path. And it's certainly not to dynamically shift the path in real time depending on what's happening in the network. That doesn't scale, at least the way that the protocols are typically built. Furthermore, internet protocols have relatively coarse fault recovery. If there is a failure in the network, you're going to require multiple rounds of pairwise exchange among different routers spread across the internet, and easily minutes can pass before the information spreads across the network.
And these routers can come to convergence, come to agreement, as to some alternate path that they might follow in getting to a destination. With Espresso SDN Peering, what we are able to do is basically break out of the box. We can have a view of what's happening in an entire metro, an entire peering location, across many routers, across many servers. We can further take a global view across all of our metros. We can use application signals, with respect to how well our applications are actually delivering content to end users spread across the world, to update how we perform routing in real time. And of course, we can then push this information in real time. This is going to be too fast, so I apologize for that in advance. But let me just give you a snapshot of how we built this. So at the bottom, we have our external peers. These external peers are running BGP, just as they always have. But what we've done is we've replaced the high-end, complex routers that we typically have at the edge of our network with a label-switched fabric. You'll see a little bit more about what that label-switched fabric means in a moment. Furthermore, we've broken BGP out of the box. We've implemented our own fully compliant, internet-standard BGP stack. And we now run this BGP stack on servers, co-located with this label-switched fabric at the edge of our network. And this is all within the metro. We have servers within the metro. They are terminating TCP connections. They are serving up video content. They're doing compute. We have a packet processor running on every one of these hosts. We've been able to leverage the same basis, at least, that we use for our Andromeda network virtualization stack to also do high-speed, line-rate packet processing at the edge of our network. Basically, what these packet processors do is they insert a label onto every packet. And this label tells the label-switched fabric which port to use to egress a packet. So now, what have we done?
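To make the host-side piece concrete, here is a minimal sketch, in Python, of what a packet processor that inserts egress labels might look like. The table contents, label values, header format, and function names are all hypothetical illustrations, not Google's actual Espresso implementation.

```python
import struct

# Hypothetical label table, pushed down by the local controller:
# destination prefix -> label identifying an egress port on the fabric.
LABEL_TABLE = {
    "203.0.113.0/24": 1001,   # e.g. egress toward peer A
    "198.51.100.0/24": 1002,  # e.g. egress toward peer B
}

def lookup_label(dst_prefix: str) -> int:
    # Hosts have lots of cheap DRAM, so this table can be internet-scale.
    # Label 0 stands in for a default route in this sketch.
    return LABEL_TABLE.get(dst_prefix, 0)

def encapsulate(payload: bytes, label: int) -> bytes:
    # Prepend a 4-byte label header; the label-switched fabric forwards
    # on this label alone, without an internet-scale routing table.
    return struct.pack("!I", label) + payload

packet = encapsulate(b"tcp-segment", lookup_label("203.0.113.0/24"))
```

The point of the sketch is the division of labor: the full routing state lives in host memory, and the fabric only has to match fixed-size labels.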
We've removed the need for an internet-scale FIB, a forwarding information base, which I basically think of as a really, really big forwarding table that doesn't fit on most commodity chips. The routers no longer need to know how to forward data. They only need to read a label that's inserted by the hosts. Hosts have lots of memory, cheap memory; DRAM is much, much cheaper than what you get on your routers. OK, and now this, of course, is all hooked into a control plane. We have a local controller within every metro. This local controller is simply programming the label-switched fabric, saying, for this label, egress from this port. The servers are sending summaries of how the flows are behaving in real time to central control. Basically, across all of our connectivity to users across the world, how are applications behaving when I use this particular egress path? So the local controller can be making updates in real time. If there's a failure, it can shift things. It might have backup paths already available in some other router. Similarly, the global controller can be integrating across all the metro views. So remember, 100 metros spread across the planet, all feeding information in real time to this global controller, which can react. If path A happens to be better than path B in real time, imagine that there's a traffic jam, there's congestion, we can adjust. So in real time, what we're trying to do is determine the best way to deliver the best quality of experience to our end users based on application-specific signals. And this is critical. A router can't know how well an application is behaving as packets fly by in real time. The application knows. It has those signals. It provides them to the controller, which will then make decisions and feed them back down to the routers for adjustment. And so this has been really transformative for us.
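The feedback loop just described can be sketched in a few lines. The metric names, tuple shapes, and selection policy below are hypothetical stand-ins; real application signals and routing policies would be far richer.

```python
# Servers report per-egress application metrics; the controller picks
# the best egress for a prefix and programs the label-switched fabric.
def choose_egress(reports):
    """reports: list of (egress_label, goodput_mbps, loss_rate).
    Prefer low loss first, then high goodput."""
    return min(reports, key=lambda r: (r[2], -r[1]))[0]

def program_fabric(fib, prefix, label):
    # Local controller action: "for this prefix, hosts push this label".
    fib[prefix] = label

fib = {}
reports = [
    (1001, 800.0, 0.02),   # nominally faster path, currently congested
    (1002, 600.0, 0.001),  # slower path, nearly loss-free right now
]
program_fabric(fib, "203.0.113.0/24", choose_egress(reports))
# The controller re-runs this as fresh reports arrive, shifting traffic
# away from congestion or failures in real time.
```

The design choice worth noticing is that the decision is driven by end-to-end application behavior, which a router watching packets fly by cannot observe.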
And it has been challenging, because now we're taking this functionality fully into our own control, and we're actually putting traffic that is delivered in real time to users across the world on these paths. OK, so this gives you hopefully a snapshot of where we've been, what we've been doing. And as I said, for me, I consider the book more or less closed on SDN, starting from our data centers to the WAN, to network virtualization, and now all the way to the peering edge of the network. So what's next? Of course, SDN is going to be playing a substantial role in it, but what are going to be the drivers moving forward? I won't have time to describe most of this, so I'm going to be focusing on just some of the cloud aspects and the implications. I'll start with serverless compute in Cloud 3.0. So where is cloud going? And what role does networking play in this context? Again, a bit of history. With what I'm going to refer to here as Cloud 1.0, this was, think of it, circa 2000, maybe 15 years ago or so, virtualization allowed enterprises to consolidate their servers. They could run multiple workloads on one machine. You could have eight different operating systems configured just the way you wanted on one physical server, rather than have eight physical servers configured just the way you needed for your eight different applications. And this was a big deal. It really changed how enterprises could leverage their data centers. What we're in the middle of right now is what I'm referring to as Cloud 2.0. We now have the capability of delivering hardware on demand, configured just the way that you want, and now you can scale up and down dynamically. So this move to the public cloud has really freed enterprises from building private hardware infrastructure. And building scalable, efficient hardware infrastructure is actually really hard.
Doing it efficiently is a challenge, and one that we're now able to address by basically scaling our hardware needs up and down. However, we're still thinking in terms of hardware boxes, even if they're pseudo-hardware boxes that happen to be running in the cloud. Running our network, running our infrastructure, hasn't gotten any easier. We still have to manage everything; it just happens to be in somebody else's data center. So we're still thinking in terms of scheduling, load balancing, et cetera. What we need to be doing, and what we will be doing moving forward, is moving to Cloud 3.0. And here the emphasis is on compute, not on servers. This is what I mean by serverless compute. What we really should be enabling people to do is focus on their core competency, which is to specify their business logic and to specify their data, how their data gets updated, how their data gets leveraged. We want to enable a move to real-time intelligence, machine learning, and serverless compute, so that people are not focused on where their data is placed, how they load balance among the different components in their system, how they're going to configure the operating systems on their virtual machines, how they're going to patch them, et cetera. And what I would argue, certainly as we're looking to the next decade and hopefully sooner, is that we need to be aiming for Cloud 3.0. In other words, the rush to Cloud 2.0 has been: take everything that I understand in my own private enterprise and move it to the cloud. And this is important, and a lot of great work has gone into this. But I would say that this is a waypoint and not the final destination for virtualization. Networking is going to play a huge role in Cloud 3.0. So let me highlight some of the bits that I see here. Storage disaggregation is going to be critical. And as I alluded to earlier, the data center as a whole is going to be your storage appliance.
Furthermore, multiple data centers are going to be the basis for your high availability design. Seamlessly, you're going to be able to replicate and transact on your data with the highest levels of availability and, of course, the highest levels of durability. Think about the networking requirements to really make any disk, and, much, much tougher, any flash device, in multiple data centers appear as if it were local. How do you deliver that kind of predictability on one network at scale? Seamless telemetry: this is basically the generalization of load balancing, scale up and scale down. Again, how do you specify what your performance requirements are, what your availability requirements are, and have your compute scale up and down transparently under the hood? The application signals and the spin-up, spin-down of compute infrastructure are going to have to be sub-second. They can't be operating at the granularity of even multiple seconds, certainly not minutes and hours. We found that transparent live migration is key. People don't want their servers to go down. That was true before cloud for Google, and it's even more true now. You can't say, I'm going to be taking this server down once a week for maintenance. What we've been able to do is move virtual machines transparently, hitlessly, from one place to another, even orchestrating moves from one data center to another data center without customers being able to notice. Once again, consider the networking requirements for doing so: just think about all the state associated with all these virtual machines, and then moving it in a manner that no one notices. This remains at least partially an open question. Finally, we need to be able to support an open marketplace. If you're going to specify compute, you're going to have third parties providing supporting services to you. How do you plug those in securely, transparently, into your network, while obeying your higher-level policies?
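As a toy illustration of the kind of calculation a system might do under the hood when turning an availability target into a cross-data-center replication decision, here is a sketch that assumes independent failure zones. The function name and numbers are hypothetical, not any real Google API.

```python
def replicas_for_availability(zone_availability: float, target: float) -> int:
    """Smallest n such that 1 - (1 - a)**n >= target, i.e. the service
    is up unless all n independently failing zones are down at once."""
    p_fail = 1.0 - zone_availability
    n = 1
    while 1.0 - p_fail ** n < target:
        n += 1
    return n

# Zones at 99.9% availability each: a 99.99% target needs two replicas.
n = replicas_for_availability(0.999, 0.9999)
```

In practice, zone failures are not fully independent and latency targets matter too, which is exactly why the network between replicas becomes the critical piece.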
So some of the main things I want to emphasize here: we need to move to policy, not middleboxes. Today, we're still specifying how packets flow through individual middleboxes to implement NFV functionality. What you really want to be able to say is, here's the NFV functionality, firewalls, load balancing, whatever it is that I need, and then have the network configure itself to deliver that for you. Similarly, we want to move toward SLOs. Service-level objectives here mean: here is the response time that I would like my application to have; here's the level of availability that I need to deliver for my application. And then under the hood, the system can figure out how much replication we should have, both within a data center and across data centers, potentially across the planet, to meet your latency needs for a global user population, to meet your availability needs, and to meet, of course, your compute demand, depending on real-time signals for the popularity of your services. So over the next decade, again, the network is going to be central to defining what compute means. It's going to enable next-generation compute infrastructure. It must, in that we're going to be moving to new computing models, certainly within individual racks, but I would say at even larger granularity. In other words, the computer is going to break out of the confines of an individual server. The network will define next-generation storage infrastructure. So once again, how do we move to a world where we don't think about local disk, local flash, and we think about storage that is available to us as if it were local, spread across the data center? And hopefully what I've been emphasizing here is that the right network infrastructure can deliver fundamentally new capability. So to emphasize some of the points made earlier, it's not only about cost. It's not about bandwidth. It's not about latency.
It's about new capability, building applications that you wouldn't imagine building otherwise. OK, I only have a little bit of time, so I'll flash up two things. One is how we internally at Google prioritize our infrastructure work. And what I want to say is that everything starts with availability. If you don't have an available infrastructure, you really don't have any purchase to do anything else. Then comes manageability. Once you have a manageable infrastructure, you can look at velocity; I'll talk about that shortly. With velocity, you can look at reducing your stranding. And finally, when you have all that, you can get to performance. And I know that there are a lot of strong academic people in the audience; I also want to emphasize this pyramid in terms of how to prioritize some of the research work. I'm going to skip to one last bit, which is velocity, which I think is just super important. I define velocity to be speed of iteration. For us to get to where we need to go, we have to be able to move fast. What I mean here is not having a fast network, but having a network that can be evolved on a weekly basis. What we strive for and what we demand at Google is that we can upgrade our network with new functionality, new features, again, without anyone noticing, so that availability, remember, the base of the pyramid, is preserved every week. If you can only iterate every month, every three months, every six months, and we heard about five-to-seven-year refresh rates from Martine earlier, you're never going to get to where you need to go. And the network will always be the bottleneck in enabling innovation. So how do you get to a place where you architect the system from the ground up, from the hardware to the network infrastructure to the applications running on top of it, such that you build for velocity? Every week, you get a new software release and no one notices. There is no downtime. There is no maintenance window.
You don't tell people at 11:43 PM, I'm going to be upgrading the network, please stop running your applications. It gets upgraded, and people then benefit from the new functionality that you've delivered. So what we really follow here is launch and iterate. How do you get your service out there with minimum viable functionality, still hopefully powerful, and then, every week, make it better? OK, so I'm going to skip the rest of the presentation and stop for any questions. Fantastic. I'm Michael Howard from IHS Markit. You must have considered stateless segment routing, but you've chosen to put state nearby and available on servers and somehow build your header from that. What were your considerations? Great question. And we are a big fan of segment routing. And certainly, we will leverage it. I think the main consideration for us is, again, iteration speed and velocity. Segment routing, to first order, isn't available yet. So how do we launch our functionality without waiting for it? I mean, so again, to make sure I'm clear, I'm a big fan of segment routing, but I don't want to be dependent on a piece of software and hardware becoming available commercially before I can move forward with my functionality. So the main consideration was, how do we get this functionality in as quickly as possible? I'm going to ask a question here maybe a little bit away from the focus. But concerning your advances in cloud and information caching and information hosting all over the world, and you being the biggest content provider, have you thought about what you are going to do next for information-centric networks, ICN, and content-based networks? Do you have any projects that are promoting this kind of thing, especially open source projects that people can collaborate with you on, or can you offer some kind of toolkits for folks? Because I think that's going to be also one of the next challenges for cloud networking. Thank you. Yeah, it's a good question.
I think that, broadly speaking, information-centric networking is something that we're engaged in, especially if you think about the implications of Cloud 3.0. In other words, really, here we're talking about named data that's transparently available to you, regardless of where you happen to be running your compute. So I'm certainly happy to talk more about that. I do agree that this compute model is an interesting one. OK, we heard about 5G, which is coming pretty soon. Millions, billions of devices. 5G is a pure IPv6 environment. When is IPv6 coming from Google? Yes, so certainly we have support for IPv6 endpoints and have had it for a long time. We've had a number of internal projects that are spreading IPv6 across our network. So depending on the particular context of your question, either IPv6 is already at Google or it is coming, certainly, for some cloud settings shortly. You can have IPv6 endpoints today without any issue. All right, I think we're wrapping it up. Yeah, thank you very much. Thank you.