There are a bunch of questions I have, but let me start with one that an audience member sent in on YouTube. His name is Arvind Padmanaban, and he's asking: if I am using a cloud setup to run my application across multiple regions, wouldn't that make the CDN redundant?

It's a great question. Look, we spoke about three things that a CDN brings to the table: performance, security, and reliability. When you're using a multi-cloud setup, or you're mirroring your site, or you've distributed your site across different clouds, you're essentially, in the e-commerce analogy, building out three smaller warehouses, fairly large in their own right, to cater to pockets of demand. The idea with a CDN is to distribute the content even closer to the end user. That's a great benefit. On security, simply put, you don't want to deal with all of these DDoS attacks and malicious requests at your cloud. Even if you have the capacity, you've scaled to cater to a business requirement, and you don't want to burn precious resources, which probably cost a lot more, on dealing with a DDoS attack or a web application layer attack. And the redundancy question doesn't really go away. Yes, there are different ways in which you can improve reliability, but essentially what you're saying is: one of my cloud instances goes down, I need to go to the alternate location. That's in fact handled better by using a CDN, in the sense that if you are not using a proxy (and that proxy need not be a CDN, it could be any proxy), your application has to actually see a failure before it can retry against your multi-cloud setup. With a proxy in between, that proxy can quickly retry without ever surfacing the failure to the application or the user. In all cases, using a CDN augments what you already have. Multi-cloud has its place, and a CDN is used to push the content closer to the end user and improve performance and reliability.

All right, thanks for answering this. If anyone has questions on YouTube, please share them now. If you're on Zoom, feel free to raise your hand and ask. In the meanwhile, I'll take some of the questions I collected while going through the presentation Satya gave. One of them, or rather a mix of a comment and a question: the setup process you showed us is mostly what's considered a reverse proxy CDN, right? So what other types of CDN are out there? Is there anything you can shed some light on? And what's the difference between this reverse proxy and the others? These terms are all out there: edge CDN, reverse proxy CDN, and so on. Can you differentiate them?

So basically, I was talking earlier about how CDNs evolved. If a CDN is handling both static and dynamic content, it falls into the category of a reverse proxy. An edge CDN could handle both classes of content, static and dynamic, and by virtue of sending the traffic back to the origin when it cannot respond itself, it becomes a reverse proxy. And you could have CDNs which just have the ability to host content and deliver it. A good example of that is GitHub Pages: you host it in your Git repository, and it's fairly static delivery from that point on. So that's a good illustration of these two different classes of CDN. But more or less there's a lot of overlap between the capabilities.
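To make the reverse-proxy class concrete, here is a minimal sketch in Python of what such a proxy does: serve cacheable content from its own store, forward everything else to the origin, and quietly retry an alternate origin on failure. The origin URLs and the in-process cache are hypothetical placeholders, not any particular CDN's implementation.

```python
# Minimal reverse-proxy sketch; the origin URLs are hypothetical placeholders.
import urllib.error
import urllib.request

ORIGINS = ["https://origin-a.example.com", "https://origin-b.example.com"]  # primary, fallback
CACHE = {}  # path -> body, standing in for the edge cache


def handle_request(path: str) -> bytes:
    # Static/cacheable content is served straight from the edge cache.
    if path in CACHE:
        return CACHE[path]
    # Dynamic or uncached content is proxied to the origin; on failure the
    # proxy retries the alternate origin without surfacing the error upstream.
    for origin in ORIGINS:
        try:
            with urllib.request.urlopen(origin + path, timeout=5) as resp:
                body = resp.read()
                if "public" in resp.headers.get("Cache-Control", ""):
                    CACHE[path] = body  # a real CDN would also honour max-age, etc.
                return body
        except (urllib.error.URLError, OSError):
            continue  # fail over to the next origin
    raise RuntimeError("all origins unreachable")
```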
All right, okay. One question that I had when you were showing the dynamic DNS: as you said, CDNs are going to use a CNAME record, and that CNAME record is eventually going to resolve to a different IP address based on which region you're coming from. So who handles this logic? Is it handled at the DNS server level or at the CDN level? How does that work?

That's a great question. When it comes to DNS, there's a clear demarcation between two classes of infrastructure. The first one is the DNS client side, and there's a DNS client in probably every layer of the workflow. You have a DNS client in your browser, there's a DNS client in your operating system, and you have a static hosts file in almost all operating systems that overrides that client. Beyond that, your Wi-Fi router can also have a DNS resolver, and your DHCP setup hands out some of those resolver settings. Beyond your home Wi-Fi there is the ISP, and almost all ISPs run a DNS server, because in the absence of one, when you want to go to google.com while connected to that ISP, it just wouldn't work. All of those are your local resolvers, and there's a chain of them. When a particular DNS record is not available in that chain, the query simply goes up, probably all the way to your ISP. And this is where you have alternatives as well: besides the resolvers provided by your ISP, or one you've set up yourself, you could use publicly available DNS resolvers. There's one provided by OpenDNS, there's one provided by Google, and there are a lot of other options available. All of these are intended to speed up local resolution.

But beyond that chain is the authoritative side: hasgeek.com's DNS records are actually hosted somewhere. We went through that example where they were hosted on Route 53. So today, all of Hasgeek's DNS infrastructure is handled by Route 53, and Route 53's infrastructure is responsible for it. Similarly, in the CDN equivalent, the DNS responsibility for the hasgeek.com record itself still lies with Zainab, and since it's hosted on Route 53, Route 53 manages it. I'll explain the rest using Akamai as an example. If Hasgeek were on Akamai, the record would likely point to an edgesuite.net or edgekey.net CNAME, and that record is the responsibility of Akamai. So different CDNs have different CNAME targets, and that's actually publicly available information. If you go to WebPageTest, they have CDN detection built in, and you can also go to GitHub; I think the list of CNAME targets is available in one of the Python files there.

So whichever CDN provider it is, Akamai or Cloudflare or someone else, they will eventually decide what this CNAME should translate to, which range of IP addresses. Or will it eventually give only one IP address, I guess, to any one client? Because I don't think a client will expect multiple IP addresses to come back for a single domain's CNAME, right?

It can. At least as far as redundancy goes, you could expect multiple IPs for a single record. What really happens is, if one of the IPs is not reachable for any reason, that forms the first failover mechanism. Got it, it cascades to the next one. Yeah. All right.
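As a rough illustration of that resolution chain, here is a small sketch using the third-party dnspython package (an assumption; installed with `pip install dnspython`); the hostname below is a placeholder for any CDN-fronted site.

```python
# Follow the CNAME chain and list the A records for a CDN-fronted hostname.
# Requires the third-party dnspython package: pip install dnspython
import dns.resolver

hostname = "www.example.com"  # placeholder for a CDN-fronted site

# A CDN-fronted host typically resolves via the CDN's own CNAME target
# (for Akamai, an edgesuite.net or edgekey.net name) before reaching A records.
try:
    for rr in dns.resolver.resolve(hostname, "CNAME"):
        print("CNAME ->", rr.target)
except dns.resolver.NoAnswer:
    print("no CNAME; the name maps directly to A records")

# The final answer can contain several A records; if the first IP is
# unreachable, trying the next one is the first failover mechanism.
for rr in dns.resolver.resolve(hostname, "A"):
    print("A ->", rr.address)
```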
Another question; I'm just starting out with some basic ones here. How does the CDN work internally? You gave the example of one warehouse versus multiple warehouses. Now, if that one warehouse becomes our origin, do each of the multiple CDN servers start making requests to our server to fetch the content, or do they keep themselves in sync?

That's a really good question. Different CDNs handle it in different ways. In almost all cases that I'm aware of, a CDN will make a request to your web server when there is demand for it. There are some outliers, and some different strategies you may use in specific cases, but in almost all normal use cases, when there's a request for a particular piece of content, the CDN will go ahead and fetch that content. If it's cacheable content, it will cache it; if it's a non-cacheable response, it will serve it back to the end user and not store it on its servers.

Right, but how do they sync between their own servers?

Okay, that's a good question. I can probably go over in detail what Akamai does. Akamai has a very distributed setup, and we essentially have tiers of caches. We have a cache which is probably deployed in every ISP. So if my local ISP is ACT Broadband, there's likely an Akamai cache in ACT Broadband. And for static content, there's also going to be an Akamai cache region closer to the Hasgeek origin server, which is on E2E Networks. When I'm trying to reach the Hasgeek server, I'm going to end up in the path of that parent cache, and if the content is available in the parent cache, I need not go back to the Hasgeek server to fetch it. If it's cacheable content, I'll know that it's available in the parent cache and take it downstream. In fact, let me put it this way: I don't know of a CDN deployment where the reverse is true, where the servers explicitly sync between themselves when there is no demand. What different CDNs do, and what Akamai does really well, is that when there is a request for a particular piece of content, the first thing is to figure out whether that content already exists within the network. There are different strategies for that. The simplest is to check the parent cache, and if it's not there, go back to the origin. That's one way to do it; Akamai has many ways, and different CDNs do it differently.

All right, and how do CDNs decide whether something is cacheable or not?

That's a good question, and I think the answer is true for almost all CDNs today. There are two fundamental ways in which CDNs decide whether content is cacheable. The first is response headers. The second is that when you set up a particular CDN config, you can explicitly say what content is cacheable and what's not. Or you can rely completely on the response headers: set those headers on NGINX, and that trickles down, so the CDN will honor the caching header, and it actually carries all the way down to your browser. Even your browser honors Cache-Control headers. So if a particular piece of content is cacheable, you should leverage the right headers to ensure that downstream caching and browser caching kick in.
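Here is a minimal sketch of the header-based half of that decision, assuming only the standard Cache-Control directives; a real CDN or browser applies many more rules.

```python
# Decide cacheability from Cache-Control response headers (simplified).
def is_cacheable(headers: dict) -> bool:
    cache_control = headers.get("Cache-Control", "").lower()
    # Explicit opt-outs always win.
    if "no-store" in cache_control or "private" in cache_control:
        return False
    # A positive max-age or s-maxage marks the response as cacheable.
    for directive in cache_control.split(","):
        directive = directive.strip()
        if directive.startswith(("max-age=", "s-maxage=")):
            return int(directive.split("=", 1)[1]) > 0
    # Expires-based caching and heuristics are omitted for brevity.
    return False


print(is_cacheable({"Cache-Control": "public, max-age=86400"}))  # True
print(is_cacheable({"Cache-Control": "no-store"}))               # False
```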
All right, so just to summarize your point: even if you're leveraging a CDN, whether you've joined a network or set one up yourself, the responsibility of ensuring that the CDN is actually caching the content still lies with the developer, who has to ensure that the right headers are being sent for all the requests. Just hooking onto a CDN doesn't mean the job is done.

In almost all cases, the most basic or starter configuration already takes care of setting up these rules. But if you want, you can take that responsibility back to your servers and handle it with header-based caching. Or you could simply say: JPEGs, PNGs, WebPs, CSS, JavaScript, just cache them, based on the file extension or the URL of the request. Or even something like: if the domain is images.hasgeek.com, everything is cached, don't look at anything else, just do something very simple for me. So rules can start off in a simple manner, or you can have granular control over how things play out.

All right. One more point: whenever you are doing a reverse proxy for serving dynamic content via a CDN, the diagram you showed clearly results in two hops, one from the client up to the CDN, and then the CDN making its own request to the origin. And if it's dynamic content, it is still not cached, so you'll keep doing that every single time. You also said in one of the slides that there is content acceleration while this happens. But in my experience, I've found it can actually slow down page requests; my explanation is that the request goes first to the CDN and then the CDN makes its request, so wouldn't it be better if I requested the origin server directly? How do you balance that trade-off?

I briefly touched upon it, but again, it's a very good question. Now, if you're the only person making a request, there are clearly two hops, right? First, I go about establishing a TCP connection and doing the TLS handshake with the CDN server, and then the CDN server goes ahead and does the exact same thing with your web server. Now, let's assume a different case. In almost all real-world environments, there are more people accessing your content. Let's say you're accessing the content, and after that Zainab went and accessed the same or similar content, and then I went ahead and fetched some additional content as well, all of us trying to figure out what the next session is going to be. In this scenario, when the first request went in, there was a handshake between you and the nearest CDN server, and the CDN established that connection with the web server. When the second request comes in, I'm going to be connecting to the CDN server, but the CDN server already has a connection established to your web server, so that connection is simply reused. And when additional people come in, they get the benefit of not having to establish the origin connection at all. So the connection is constantly reused, and it works really well.
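Connection reuse is easy to see in miniature with Python's standard http.client: the first request on a connection pays for the TCP and TLS handshake, and later requests on the same connection skip it. The hostname is a placeholder, and this assumes the server keeps the connection alive.

```python
# One TCP + TLS handshake, several requests over the same connection.
import http.client
import time

conn = http.client.HTTPSConnection("www.example.com", timeout=10)  # placeholder host

for path in ["/", "/styles.css", "/app.js"]:
    start = time.perf_counter()
    conn.request("GET", path)
    resp = conn.getresponse()
    resp.read()  # drain the body so the connection can be reused
    # The first iteration includes connection setup; later ones usually don't.
    print(f"{path}: {resp.status} in {time.perf_counter() - start:.3f}s")

conn.close()
```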
Apart from the connection aspect, connections were just one part of it. Like I said, different CDNs do this differently, but Akamai can figure out whether going to your origin server from that CDN location by simply following the BGP routes is the smart thing to do, or whether there is an alternate path which is more optimal to reach your data center and server. So that again provides a lot of acceleration in terms of finding an optimal path. There are also optimizations within the network: you could have dedicated infrastructure or optimizations at the TCP layer which handle this particular use case. And again, all of this essentially ensures that once a request has come to a CDN location, from that point on the request gets an expressway back to your data center. And if the content already exists, then it's fairly straightforward to serve it off a nearby location.

All right, so from what I understand from your explanation, the browser or the end client still has to open a TCP connection; it's just that instead of connecting to the origin server, it's connecting to the CDN. So that's not where the saving is. The saving is in the path that is taken and in the connection establishment from the CDN server to your origin server. That's where the saving comes in. Is the saving really that much? If it's dynamic content, I'm assuming the CDN will make discrete, distinct requests back to the origin server, so if two of us make simultaneous requests to the CDN, the CDN will have to make two requests to the origin server, because it doesn't know what the content is going to be. Is the path optimization in itself, and the fact that the CDN already has an open TCP connection, a sufficient substitute for the fact that there are two hops? That is something I have not experienced very clearly. Or are you saying this is something you will experience only when there is sufficient traffic, with multiple people accessing it?

No. Like I said, the hops part of it, or connection reuse, is one aspect of it. Then there's finding an optimal path, and in between you have TCP optimizations. But I've only touched upon features that are common across the board, irrespective of which CDN you pick. To go a little bit deeper: CDNs, because they have specialized networks, tend to adopt technology that improves your connection speeds much sooner. A good example of that is something like TLS 1.3, or HTTP/2, and then HTTP/3 once it gets standardized. CDNs are probably going to be among the first to support it, and your browser already supports it. Now, as an individual, or as a content owner or ops person, it's very hard to figure out when is the right time to upgrade your infrastructure. When CDNs roll that out, you can leverage the benefits of that upgrade seamlessly. And like I said, the second hop that we were talking about is already established, so you might not even see the benefit of TLS 1.3 there, because there's actually no handshake; there's an already established connection to reuse.

Okay, this is interesting. So you're saying that not only is there optimization between the CDN and your origin, but there's also faster adoption of technology, so there's a chance that the leg between the end browser or client and the CDN is also optimized before you can adopt those technologies on your own origin server. So on both sides there are improvements, and it's possible to make the entire thing faster. All right. Yeah, that's interesting.
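If you want to see which protocol version a given edge or origin actually negotiates, a quick check like the following works with Python's standard ssl module; the hostname is a placeholder, and TLS 1.3 support depends on the OpenSSL your Python is built against.

```python
# Check the TLS version and cipher negotiated with a server.
import socket
import ssl

hostname = "www.example.com"  # placeholder host
context = ssl.create_default_context()

with socket.create_connection((hostname, 443), timeout=10) as sock:
    with context.wrap_socket(sock, server_hostname=hostname) as tls:
        # TLS 1.3 shaves a round trip off the handshake versus TLS 1.2,
        # which is one reason early adoption at the edge matters.
        print("negotiated:", tls.version())   # e.g. 'TLSv1.3'
        print("cipher:", tls.cipher()[0])
```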
So I have one deliberately provocative question: is a CDN a solution for badly written code?

Well, in some ways, I would look at a CDN as an effective tool to manage such situations without having to invest a lot of effort up front in either rectifying the code you've already written or rushing in patches and fixes. Let me take two examples to showcase this. Now, what's bad code to you might not be bad code to a different organization. The way a certain site was written could be down to a lack of people to pick up the tasks, or a tighter deadline, or the focus being not to optimize the site but to meet the business requirement of rolling out features, or that additional capability which needs to ship to cater to user demand or competition. You don't want to be maniacally focused on optimizing code all the time. Sure, if there are optimizations that give you a lot of benefit, you should absolutely do them; there is no substitute for that. But take, for example, spending an unreasonable amount of time optimizing images when you have solutions which do that for you. In my mind, it's as good as using libraries: today we use libraries when they're available, they're specialized to do a task, so just leave it to them. At that point, it's not a question of whether it's a band-aid or not; it's more a question of what the right solution is.

You can see this play out in the security space in a very apparent manner. Security is a very specialized field, and even though individual developers are very technical, it's very hard to keep up with what's happening in the security space. And often you're using libraries that are available and widely distributed when you're building out those applications. Let's say there is a particular vulnerability in one of the stacks you're using. It's very hard for you to patch it. One, you don't have expertise in that library, so there's little or nothing you can do to fix it immediately; you have to wait until the library is patched and figure out what to do next. Once it's patched, you have to test it, put in your fixes, and then figure out if that's the right solution for you. The attackers are not going to give you that amount of time to slowly wait and watch and then roll out fixes. So security is one of those domains where a CDN is not just a band-aid fix; it is in fact very necessary and critical for day-to-day operations.

The only place where this argument might make sense is that in the old days, a couple of years back, front-end optimizations were handled very differently. Today, developers don't have to think too much about optimizations on the front end; your browser does a lot of it for you. I think that's actually an interesting way technology evolves. For example, one of the things all of the browsers have done really well is handling lazy loading: you have browser attributes which handle lazy loading for you. In the past, there were CDN capabilities that would dynamically optimize your pages for lazy loading, but none of that is really relevant anymore. So the ecosystem does a very good job of constantly evolving. If something is broken and needs to be fixed today, that's a business-critical requirement, and a CDN handles that really well. And some of the other optimizations on the network side are things no amount of good code can really help with.
Like, for example, if a request has to reach from point A to point B, it is going to take a certain amount of time, and if you can reduce that, that's a performance benefit. There's no substitute there.

Right. Okay. All right. I'll ask one final question, and this is something that I've heard on Twitter a few times; people have expressed it in different ways. What are your thoughts about arguments like: CDNs are this man in the middle, and if they are exposed to risks, or if their conduct is not good, then you take down a lot of the internet together, or a lot of sites get affected by it. So while you might say it gives you a lot of security benefits, one could argue that it's also becoming a big single point of failure in many ways. What are your thoughts about this before we wrap up?

That's a great question. I can only speak for Akamai, and in my personal capacity. In my opinion, this is a serious problem, and all CDNs take a lot of care in rolling out features and a lot of care with their infrastructure. Almost all CDNs are distributed, and Akamai has a highly distributed system. To put it differently, at least when I think about Akamai, one of the things we do really well is ensure that at no point in time is the entire platform affected. That starts with the way we handle infrastructure and the way we handle software updates; all of that is rolled out in a careful manner, because we have a lot of customers. So what I was saying is, a lot of CDNs, when we roll out infrastructure changes or even software updates, take a lot of care and pride in ensuring that it rolls out seamlessly. I can talk about Akamai: we take extra care in ensuring that there is reliability and redundancy built into every layer, and that's especially true for our networks. One of the things Akamai is really proud of is the fact that the platform has never actually been down in a really, really long time. I think we had an outage a very long time ago, but outside of that, the Akamai platform has never gone down. And that's one of the biggest reasons a lot of large customers trust Akamai to deliver business-critical traffic all the time. And yes, it is a problem. But to round it off, you really have to ensure that there are some redundancies built in, and that you're picking a CDN with a good track record of not having too many of these issues. And yeah, sometimes things can go wrong, but that's part and parcel of the game.

Yeah. All right. That's a fair answer. Thanks a lot, Satya. It was great having you. We stretched it by about fifteen minutes, but it was...

No, that's good, because we covered a lot of things, and slowly too, which will help people understand all of this. It was a pleasure.

As I said to everyone else, we have a localization talk as well as an AMP talk scheduled for the next two weeks. So if you want to learn anything about how to make sites in local languages and so on, we are going to discuss that next week, and the week following that, whether you should have your website on Google AMP, essentially. If you want to put in a submission and give a session similar to what Satya just did, please put in a proposal on hasgeek.com. We look forward to your proposals, and we'll put out the schedule for more weeks soon after. Thanks a lot. Thanks, Satya. Have a good day. Have a good weekend.
Thank you for having me. Take care. Bye.