Good afternoon all, and welcome to this exciting session on service mesh use cases for telco and edge. I'm Kunal Shukla, key account executive in the telco, media, and entertainment vertical at Google. Today, I'm going to talk to you about some of the key industry trends, drivers, opportunities, and challenges within the telco and edge industry. Let's start with a closer look at the key industry trends that are driving the edge and telco model. I have summarized them in three areas. Our lives are changing: the way we work, the way we live, the way we play. We live in a world of connected everything. Ericsson predicts 25 billion connected devices by 2025, and these devices will generate huge amounts of data. Gartner predicts 75% of the data generated will be processed near the source where it is generated. This has led to a strong focus on edge initiatives and on the edge as a business services platform to deliver new business outcomes and solutions for enterprises. Next is telco transformation. The telco industry has gone through tremendous transformation in the last five years to build the foundation of next-generation networks for consumers and enterprises. Telcos, for the longest time, have relied on network equipment providers to deliver an integrated, proprietary stack solution, driving vendor lock-in and high operating costs. In the last three to five years, telcos have embraced technology disruptions through cloud, SDN, network function virtualization, and CI/CD processes, leading to an open architecture, an optimized cost structure, and reduced time to market. Although they're not past the goal line yet, they have run into multiple challenges within the ecosystem which they're trying to resolve. Third is 5G. 5G is not just another G. Historically, 2G, 3G, and 4G have focused on wireless technologies, spectrum utilization, and getting bandwidth to consumers like you and me.
5G brings forward a paradigm shift and promises to deliver a service-based architecture focused on end-user experience and services through concepts such as network slicing and better control and management. Overall, these are the building blocks transforming the industry and enabling enterprises to drive new monetization opportunities. Let's look at the drivers for edge and 5G. I have summarized them in four main areas. Number one is the need for latency-sensitive solutions: the ability to take actions based on data generated by connected devices in milliseconds rather than seconds. Next is privacy and security: keeping the data generated within the enterprise environment, based on policies, and not sharing it across clouds or across networks. Number three is rightsizing the bandwidth. Network costs are in general a big concern for enterprises, and they are looking at effective ways of managing network cost and prioritizing the right traffic. Overall, it is about driving new business outcomes and value, whether in terms of consumer experiences, new products for monetization, or operational efficiencies. So let's talk about where the edge is. Let's look at this picture from the right side to the left. Today, data is typically generated on devices but stored in data centers or the public cloud, which number in the tens of regions. With the drivers discussed on the earlier slide, the industry is seeing a left shift, moving cloud environments towards the edges. At Google, we are already extending Google Cloud into our PoPs, and we are extending our partnerships with telcos to bring our cloud environments to the telco edges, which number in the thousands.
In addition, we are extending this cloud environment to the customer edges, which are branch offices, warehouses, retail stores, or campuses, where you can drive truly low-latency, next-generation business outcomes on-premises within the enterprise environment. The main goal is to run the right workload at the right cloud location to drive the right business outcomes. We see a role for all distributed cloud locations, whether at the edge or in the regions. The main goal is to provide a business services platform for enterprises and developers to build consumer and business solutions. So let's take a look at how enterprises can use edge and 5G. 5G and edge together provide a horizontal platform that enables vertical industries to build, monetize, and transform their business and consumer outcomes. Strong focus has been placed on industries such as retail, manufacturing, and healthcare taking this base platform and bringing in new technologies to drive new services and solutions. For example, retail is looking at edge and 5G to drive consumer experiences such as queue management, heat maps for product placement, dress-your-look using AR and VR, and no-contact checkouts. In addition, they are looking at self-cleaning and self-storing robots. Manufacturing is looking at driving operational efficiencies in the connected factory by reducing faults using AI- and ML-based solutions, security surveillance, and asset management and tracking. Healthcare has seen similar interest and promise around pharmacies of the future and transforming the healthcare industry, especially in today's environment of remote patients and remote doctors. So let's walk through a customer-edge example of what we are trying to do as an end-to-end service using 5G and edge. In this case, we have extended our cloud environment to the customer edge, that is, the warehouse of the enterprise.
Telcos are providing 5G connectivity dedicated to the enterprise customers, to which all the devices in the enterprise locations are connected. In this example, all traffic coming from video cameras in the warehouse would be kept at the enterprise location based on enterprise policies. Once the camera feed is kept local at the enterprise, we provide Google's edge application stack, which is a Kubernetes-based managed platform delivering infrastructure-as-a-service, platform-as-a-service, or software-as-a-service environments to enterprises. In this specific case, the application is a video intelligence application that processes the video cameras' RTSP feeds to deliver tangible business outcomes, such as counting people within the warehouse during COVID times, providing surveillance, security, and asset management. The AI and ML models for the video intelligence are trained in Google Cloud, but the execution and the actions are taken at the edge, within the enterprise application and within the edge application stack. The same model can be extended to multiple other use cases, whether manufacturing, healthcare, venues, or stadiums. The idea is to host multiple applications to deliver new business outcomes and experiences for enterprises. Now let's take a look at 5G and the network. Everywhere you go these days, you're going to hear about 5G. Even my kids are talking about the new 5G iPhone. So let's uncover how 5G is transforming the telco architecture under the hood. I classify it in four main areas to understand this better. Number one is multi-vendor, cloud-native network function applications. If you look at the bottom of the slide, where you see RUs, DUs, and CUs, in 5G telcos are embracing a cloud-native architecture. I will not go into the alphabet soup of network functions, but they are essentially cloud-native applications that make your calls happen on a daily basis.
When you browse high-speed data on your phone or at home, these are the applications which make it happen. So number one is a multi-vendor, cloud-native network function environment. Number two is service-based architecture. 5G networks leverage a service-based architecture with the applications built using a microservices methodology. With the microservices-based architecture in 5G, it will ultimately evolve into a complete service mesh with service discovery, load balancing, encryption, and authentication, employing sidecars for inter-service communication. Taking the cloud-native, microservices-based applications, this enables the telco to start distributing its cloud environment: if you look at the radio edges, the 5G radio site becomes a cloud environment on which radio functions reside. You can have the 5G edge, where your data plane application and other applications reside. You have your 5G core, where you will have the rest of the control plane applications and the data management applications. And you have the public cloud, where you bring in new differentiated applications on the 5G network. Essentially, this enables a distributed cloud deployment model. The last thing is dynamic network slicing. 5G introduces the concept of network slicing, by which telcos can provide you a dedicated network slice based on different traffic types. You could be an enterprise asking for a network slice, you could be a manufacturing entity, you could be connected cars, or you could be an IoT service provider. Each of these solutions can be provided a network slice based on the traffic type. Overall, the 5G telco network is about more than just speed. It's about how you enable consumer and enterprise experiences, enable machine-to-machine communication at scale, and deliver ultra-reliable, low-latency use cases, such as autonomous vehicles, drones, AR, and VR, which will truly transform consumers and businesses.
While the promise of edge and 5G with telcos is appealing, there are multiple challenges that need to be addressed in order to truly realize this at scale. Let's talk about that. The first one, as I spoke about, is distributed cloud management: moving from tens of data centers to tens of thousands of cloud environments. How do you drive the distributed cloud environment and manage it at scale? The second is the focus on enterprise and consumer services. Resources need to be scaled dynamically without any business impact. That's where dynamic service scalability and performance come in. The third area is service definition and orchestration across VMs and containers in a distributed cloud environment. Applications will be running on both VMs and containers. How do you manage them both? And using that, how do you provide end-to-end SLAs and availability across the business-critical systems residing in these distributed cloud environments? The next one is network slicing: the ability to dynamically deploy, manage, and measure these network slices at the service level becomes extremely important. And the last part is harmonization of service control, the service mesh, across multi-vendor applications and network functions. How do you provide that multi-vendor service control? When I look at these challenges, service mesh is a critical paradigm which can solve many of them. I would like to invite my colleague, Prajakta, to do a deep dive into service mesh for telco and edge use cases and how we address these challenges. Prajakta. Thank you, Kunal. Hello, everyone. I'm Prajakta. I'm the product lead for cloud networking, telco, and edge in Google Cloud. As Kunal described, service mesh is a key paradigm for solving many challenges for cloud, telco, and edge use cases. As discussed, as an industry, we have three big opportunities to tap into for telco and edge. First is working with telcos to transform their IT, network, and services.
This is going to be fueled by leveraging the best of the technologies, experience, and talent that telcos, public cloud providers, ISVs, and others have, with all of us becoming better in the process. The second major opportunity is around delivering value to enterprises with edge computing solutions and 5G as a business services platform. This will create new revenue streams for the broader ecosystem. The third one is as much an opportunity as a necessity to enable the first two. It's about breaking down the silos that have existed between the network and services platforms, paradigms, service orchestration, automation, and more used by telcos and public cloud providers. It's also about jointly building pieces that don't exist yet. That's how we can unlock seamless delivery of services across a globally distributed edge and cloud. Now, to do this, we need open, scalable, resilient network and services platforms that can work across telco, cloud, and edge. Most of you will have already guessed the three cloud-native technologies that are important for this platform: containers, service mesh, and serverless. Now, here's how I generally explain service mesh to networking and telco folks. Think back a few years: you had all of these complex network appliances, and then came software-defined networking, or SDN, and it disaggregated these closed, complex appliances into a control plane and a data plane. The data planes were simple and forwarded traffic, and they were controlled by a sophisticated, often logically centralized, control plane. Think of service mesh as software-defined networking for services. You take a complex application and you remove all the networking code from it. You extract this out and, in one model, move it to a sidecar service proxy. Now you need a way to configure, control, and apply policy to the proxies in the data plane. And that's why you have a service mesh control plane.
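To make the SDN analogy concrete, here is a minimal, purely illustrative sketch of the split just described: simple proxies hold routing state, and one logically centralized control plane pushes the same configuration to all of them. All class and field names here are invented for illustration; a real control plane such as Traffic Director or Istiod is vastly more sophisticated.

```python
class SidecarProxy:
    """A trivial data plane: forwards requests per the config it was pushed."""
    def __init__(self):
        self.routes = {}

    def apply_config(self, routes):
        self.routes = routes            # control plane overwrites local state

    def forward(self, service_name):
        return self.routes.get(service_name, "no-route")


class ControlPlane:
    """Logically centralized: one config, distributed to every proxy."""
    def __init__(self):
        self.proxies = []

    def register(self, proxy):
        self.proxies.append(proxy)

    def push(self, routes):
        for proxy in self.proxies:      # same policy everywhere, at once
            proxy.apply_config(routes)


cp = ControlPlane()
p1, p2 = SidecarProxy(), SidecarProxy()
cp.register(p1)
cp.register(p2)
cp.push({"payment": "10.0.0.7:443"})    # hypothetical service and endpoint
```

The point of the sketch is only the shape of the relationship: the application never sees any of this, because its proxy is the one that gets configured.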
One of the biggest benefits of service mesh is that it decouples development from operations, which means developers no longer need to write and maintain policies and networking code inside their applications. Service mesh does not make any assumptions about where your service is or whether it is instantiated on VMs, containers, or bare metal. That's why it provides a framework that can be used consistently across multi-cloud deployments, across heterogeneous VM and container environments, and for telco and edge. What are the benefits of service mesh? The first is to simplify the networking architecture and deploy advanced traffic management capabilities easily. The second is to enhance security: encrypt all data in transit by automatically using mTLS on all calls within the mesh, and use policy to allow only explicitly permitted calls. The third is to ensure more uptime, safer rollouts, and lower time to resolution with logs, telemetry, and traces gathered for every service in the mesh. Let's take a closer look at the service mesh. Basically, a mesh abstracts the notion of a distributed system's network from the application, because the individual services are not aware of the network at large. In this model, they only know about their local proxy, and each proxy is configured and administered separately. The service mesh control plane takes these individual data planes, whichever service proxy you use, and composes them into something larger by providing unified configuration and management and, depending on the implementation, added intelligence in the service proxies. The big benefit is that a service mesh lets you reason about policies at the level of services instead of infrastructure. One of the most popular ways to implement a service mesh is using the open source Envoy proxy, deployed next to the application logic as the data plane, with Envoy managed by a service mesh control plane.
The service mesh control plane and data plane speak the open xDS protocol, which is also evolving. Envoy makes the data plane highly programmable and extensible, and it's one of the most popular proxies in the market. There are many implementations of service mesh today. Some of these are also aiming to solve new problems with mesh constructs, to bring mesh constructs to solve layer 2 and 3 issues, to provide a platform for building new service control planes, or to provide an abstraction layer that can help you plug in one or more meshes underneath. My personal favorite is the Hightower service mesh; you should check that Twitter link out. Jokes aside, I'm going to use Google's Traffic Director in my examples today, but you're welcome to use your favorite mesh as long as it supports the capabilities I'm about to describe. Let's take the example Kunal described. We want to build a smart retail solution for a retail store, but the solution is going to be hosted across the retail store, the telco network edge, and the cloud. What are the capabilities of service mesh we can leverage here? First of all, according to Gartner, a very small percentage, about 5%, of enterprise apps worldwide are containerized. Similarly, telcos have significant VM-based apps, which means a service mesh implementation should have first-class support for both VMs and containers. The next set of capabilities to leverage are the traffic control capabilities. Let's say you want to roll out a new version without worrying about ops challenges: for example, canarying, A/B testing, service migration, and so on. You can easily configure routing rules to split the traffic based on weight to make any of these happen. You can also steer traffic to services based on HTTP headers. When there is a match, you can configure a variety of actions: traffic splitting, redirects, URL rewrites, and much more.
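The weight- and header-based routing just described can be sketched as toy logic. The rule format below is hypothetical and invented for illustration; a real mesh expresses this through the xDS RouteConfiguration that the control plane programs into the proxies.

```python
def pick_backend(rules, headers, r):
    """Pick a backend: header matches win, otherwise a weighted split.

    r is a number in [0, 1) standing in for the proxy's random draw."""
    for match in rules.get("header_matches", []):
        if headers.get(match["header"]) == match["value"]:
            return match["backend"]
    cumulative = 0.0
    for split in rules["weighted_backends"]:
        cumulative += split["weight"]
        if r < cumulative:
            return split["backend"]
    return rules["weighted_backends"][-1]["backend"]


# Hypothetical canary rule: 90/10 split, with a header-based escape hatch.
rules = {
    "header_matches": [
        {"header": "x-canary", "value": "true", "backend": "prediction-v2"},
    ],
    "weighted_backends": [
        {"backend": "prediction-v1", "weight": 0.9},
        {"backend": "prediction-v2", "weight": 0.1},
    ],
}
```

With rules like this, ramping the canary from 10% to 50% is a config change pushed by the control plane, with no application code touched.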
Fault injection helps you test the resiliency of services to different forms of failures, like delays, aborted requests, et cetera, by simulating various service failure scenarios such as high latency, partial availability, and overloads. As part of fault injection, when a client sends requests to a backend service, delays can be introduced by the service mesh control plane, in this case Traffic Director, on a percentage of requests before sending those requests to the backend service. Similarly, requests from clients to a backend service can be aborted by Traffic Director for a percentage of requests. One feature I personally love is traffic mirroring. This feature allows a shadow application to receive the real traffic that is processed by the main version of the app. It's fire-and-forget, which means responses from the shadow service are discarded. Fire-and-forget traffic mirroring can be a powerful way to test binaries with production traffic, and it can also help you debug errors happening in production using the shadow service. If service C cannot reach service B for, say, X consecutive tries, trip the circuit breaker and enforce a timeout on all calls to service B. It is as simple as that to turn on circuit breaking and outlier detection in the world of service mesh. This is one of the areas people don't try out as much, yet it is one of the most powerful areas of service mesh, especially with the Envoy proxy. One of the key considerations in telco and edge use cases is the ability to provide end-to-end visibility and SLAs. In a service mesh, every service is inherently observable and emits signals. You can easily set up metrics emission, collection, and tracing, and then build out closed-loop automation to act on the gathered insights. Anthos Service Mesh from Google, for example, has comprehensive SLI, SLO, and SLA management capabilities. Many other service mesh implementations provide related capabilities as well.
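The circuit-breaking behavior just described can be sketched in a few lines. This is a toy model of the idea only; real outlier detection in Envoy has many more knobs, such as ejection time, success thresholds, and per-host ejection percentages.

```python
class CircuitBreaker:
    """Trip after max_failures consecutive errors; fail fast while open."""
    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.consecutive_failures = 0
        self.open = False

    def call(self, fn):
        if self.open:
            return "fail-fast"          # do not even attempt the backend
        try:
            result = fn()
        except Exception:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.max_failures:
                self.open = True        # eject the unhealthy backend
            return "error"
        self.consecutive_failures = 0   # any success resets the count
        return result


def flaky():
    # Hypothetical backend that is currently down.
    raise RuntimeError("backend unavailable")


breaker = CircuitBreaker(max_failures=3)
outcomes = [breaker.call(flaky) for _ in range(5)]
# After three consecutive errors the breaker opens, so the remaining
# calls fail fast without touching the struggling backend.
```

The value in a mesh is that this logic lives in the proxy and is turned on by policy, so the calling service stops hammering an unhealthy backend without any application change.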
Service mesh aims to make granular, service-level, policy-driven security and enforcement easy to deploy. Service mesh traffic can be automatically encrypted using mTLS, and configurable authentication policies and secure naming information ensure authorization. You can have fine-grained, role-based access control at the application layer for micro-segmentation, and you can combine service mesh security and observability to perform security auditing as well as to detect and investigate anomalies. Service mesh enables you to specify traffic management, security, and observability policies at the service level in a compute-agnostic manner. For telco and edge, one key requirement is to manage a mixed environment of VM and container services. The other use case is to migrate VM-based services to container services using a cap-grow-drain strategy. In the past, telcos had MPLS networks and wanted to migrate to optical. They did this using a strategy called cap-grow-drain; the first time I heard this term was from AT&T and ONF. You cap your MPLS network, you grow optical, and you drain existing MPLS onto optical. The same applies to the transition from VM to container services. Continue to support existing deployments with VM orchestration, introduce a container platform like Google Cloud's Anthos for all new containerized deployments, and then slowly drain VM-based services to container services. In the interim period, when telcos have both VM-based and containerized services, these can coexist seamlessly and interact with each other at the service mesh layer, since the service mesh is compute agnostic. Telco and edge environments will often have legacy appliances or clients alongside cloud-native services and clients. Imagine you have a legacy client where you cannot insert an Envoy proxy.
Now, since load balancing in a mesh is client-side, how do you split the traffic from a legacy client to, say, v1 and v2 of the prediction service, as you did before? You can do this by inserting a managed middle proxy based on Envoy. You can think of this as an Envoy-based Layer 7 internal load balancer, and then you can specify traffic splitting, or any policy you cannot enforce on the legacy client, on the L7 ILB. An example of this implementation is Google Cloud's Traffic Director with an Envoy-based Layer 7 internal load balancer. One of the key requirements we started seeing with a subset of telco and edge use cases is high performance, where the penalty of a sidecar proxy was not acceptable. This was one of the reasons we built out a new flavor of services in the mesh: proxyless services. Traffic Director support for proxyless gRPC services is based on a simple idea: if Traffic Director can configure sidecar proxies to do load balancing on behalf of a gRPC client, why not just have it configure the gRPC client directly? To make proxyless gRPC possible, we continue to adhere to the open xDS API, and we added xDS API support to the most recent version of gRPC. The xDS API is the same open source API as you use for the Envoy proxy, which means you get a level of consistency. For telco and edge, we see three main use cases for the proxyless gRPC approach: simplified gRPC adoption, high-performance services in a service mesh, and bringing service mesh to environments where you cannot add a sidecar proxy. In this example, the services deployed at the telco edge are proxyless gRPC services. Actually, there's one more interesting innovation to keep an eye on: bringing programmability to the client endpoint through Envoy Mobile and gRPC-Web. This will help create a true end-to-end mesh stretching from the client to the edge to the cloud, and we continue to do more work in this area.
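The proxyless idea can be sketched as follows: the control plane hands the client its endpoint configuration, and the client picks a backend itself, with no sidecar in the data path. The names and the round-robin policy here are invented for illustration; in real proxyless gRPC the configuration arrives over the xDS stream and the client uses the target scheme and policies configured by the mesh.

```python
import itertools


class ProxylessClient:
    """The client consumes control-plane config directly (no sidecar)."""
    def __init__(self):
        self.endpoints = []
        self._rr = None

    def on_xds_update(self, endpoints):
        # Stand-in for receiving an endpoint update over the xDS stream.
        self.endpoints = endpoints
        self._rr = itertools.cycle(endpoints)

    def pick_endpoint(self):
        return next(self._rr)          # simple round-robin load balancing


client = ProxylessClient()
client.on_xds_update(["10.0.0.5:50051", "10.0.0.6:50051"])
```

Because there is no extra network hop or proxy process per request, this flavor trades some of Envoy's richer L7 features for the raw performance that some telco network functions require.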
One more key use case of service mesh for telco and edge is cross-cluster, cross-region, and cross-edge failover. Typically, if you did this on your own, it would involve a lot of toil to set up. To solve for this use case, we built cross-cluster, cross-region overflow and failover capabilities inherently into our service mesh through our service mesh control plane, Traffic Director. With these capabilities, you can easily deploy a service with VM or container instances in multiple regions, and also with endpoints in non-GCP locations. Under normal conditions, in the picture you see, traffic from Iowa will flow through instances in US Central and traffic from Tokyo through those in Asia Southeast, for every service: your frontend, your cart, and your payment service. Now, if the frontend service instances in the region closest to the end user have no capacity, traffic will automatically overflow to the other region. And if any of the services, say the payment microservice, goes down, traffic automatically fails over to the other region to avoid an outage. This capability will be key to managing capacity at the edge, because there is not as much capacity there as you have in the cloud, and you have to be able to do these types of failovers as well as overflows. This is also very critical to guaranteeing SLAs in the world of telco and edge. Now, moving on from edge use cases to the core telco network. The goal here is to help deliver a CAPEX- and OPEX-efficient network and make it ready for 5G by bringing in cloud-native technologies, including service mesh. Service mesh is a key paradigm for the 5G core service-based architecture, which provides a modular framework from which common applications can be deployed using components from one or more providers. So let's bring it back to plain English. One of the key components of deploying 5G is the 5G packet core.
This comprises a 5G core control plane and data plane. What you see on the top is the control plane. The 5G control plane itself is a set of network functions, like the AMF and SMF, that communicate with each other through well-defined interfaces. While we don't have enough time to get into the details of each service, at a high level this control plane, with all of its alphabet soup of services, can be deployed as a service mesh. Think of the network functions as services and the service-based interfaces between them as service-to-service communication. You can deploy these as gRPC proxyless services, which we described earlier, to deliver high performance, or you can deploy them with Envoy sidecars, depending on your performance requirements. You can then control these services with a service mesh control plane, bringing all of the service mesh benefits of scaling, traffic control, security, observability, and more to the 5G core service-based architecture. Another key benefit for telco use cases is that Envoy, at its core, is extensible. You can write your very own extensions. The Envoy architecture makes it fairly easy to extend, with a variety of extension types: access loggers, access log filters, clusters, listener filters, network filters, HTTP filters, gRPC credential providers, and health checkers. More recently, there's also all of the work around Wasm, and so on. As an example, one of our partners, Tetrate, has an offering called GetEnvoy with a build pipeline that makes building extensions accessible. What this pipeline does is allow you to cherry-pick the extensions that really matter to you as you build out Envoy. This extensibility of Envoy is really going to drive its adoption in telco and edge use cases. Now, in addition to everything we discussed, what are some of the service mesh innovations we can partner on to solve as an industry?
If you take a broader view of where we are as an industry, we're just getting started, and we have many things to solve together, especially in the area of telco and edge. While we are making great headway on creating consistent cloud-native network and services platforms and evolving them for enterprise, telco, and edge, we need to do much more to abstract out complexity and manage heterogeneity better. For telco, we need to get to the point where network services and slices automatically scale and optimize, are proactively secured, and are effortless to administer and use. Take the effortless edge problem. The edge is distributed. To harness the power of this distributed edge, I, as a telco, as a public cloud provider, or even as an enterprise, should be able to leverage any of these edges in addition to my own, based on my business and technical requirements. And when I say any of these edges, I'm referring to putting something on an enterprise premises, in a factory, in a stadium, at a telco network edge, at Google's edge (which is our PoPs), or in Google Cloud. It's a whole globally distributed edge, and it's a continuum; we should be able to use any of it. I should also be able to move my service from the edge it is deployed at to another edge, say to bring it closer to the end user or to lower cost. I should also be able to deploy a service that spans multiple edges. Take the case of machine learning, where for my services I may do a small amount of processing on the user's device, which then connects up to a telco edge for scrubbing, which then connects up to Google Cloud to do the machine learning.
I should also be able to chain services deployed across a variety of edges, so I can leverage best-of-breed services from a variety of providers for my end customers, or create a slice across the continuum (we'll talk a bit more about slicing). And I should be able to do all of this as easily as possible, without having to know where the edges are and without needing deep technical knowledge. We have bits and pieces that can help deliver a solution for this, but we haven't really put together a full-blown solution as an industry that solves all of these problems. To make this happen, I'll describe one example of what's needed. We need a layer above the service mesh that allows users to specify desired outcomes for the end-to-end application, or service chains if you want to go a little lower in abstraction, or workflows if your company is workflow-oriented, as well as placement policies: for example, optimize for latency, or optimize for cost. This layer then orchestrates and connects the services and underlying infrastructure using service mesh and other open APIs. We have a lot of work to do here, but it's one of the most interesting areas to partner on as an industry. Another interesting problem to partner on is effortless network slicing at scale for telcos. You will hear a lot of talk around network slicing. Very simply, from the end user all the way to the application, you want to create a slice at every layer, from the physical infrastructure all the way to the service layer, where you're guaranteeing a certain set of parameters to the slice, if you will, that was created for that end user and that entire flow. Network slicing is actually happening today, in experimental setups and at small, limited scale.
But if you really want to monetize this infrastructure, whether you're a telco or somebody else monetizing network slicing, we are missing a few abstraction layers, and I'll describe one such capability here. It is the ability to treat everything as a service. The absence of this is what is making network slicing more complicated than it needs to be. You can think of this almost as an evolution of the service mesh concept. It is possible today to treat entities in a service mesh easily as services. But what about third-party opaque services, or legacy services that sit outside the mesh? We need to extend the notion of a mesh to be more inclusive, and in fact we've started the work on extending the service abstraction across all of these and adding capabilities. We are working closely with telcos, ISVs, partners, and other folks in the industry, and we look forward to collaborating with all of you as well. So, service mesh is a key paradigm for solving many challenges for telco and edge. We discussed a few of the use cases today, including traffic control; pervasive security and observability; cross-region failover and overflow; and multi-cluster, multi-region, multi-cloud services. We also discussed cap-grow-drain for VM-to-container migration. And we took a quick look at the 5G core service-based architecture. Obviously, these are very deep topics, we could do a talk on each one of them, and there are many other use cases as well. But the key thing to remember is that service mesh itself is evolving, and we need to continue to evolve it as a paradigm to support telco, edge, and other new use cases. We look forward to collaborating with all of you on service mesh evolution and innovation. Thank you, everyone.