Hello everyone, thanks for coming to my session. Today we are going to explore how a global cloud infrastructure can be used to build geo-distributed applications. We'll discover what a geo-distributed application means in practice, and we'll see how to build those applications in such a way that the global cloud infra makes them faster and compliant with different regulatory requirements.

A little bit about myself. My name is Denis Magda. I've been working with distributed databases and high-performance computing systems for the last eight years. Right now I'm the head of developer relations at Yugabyte, a distributed database built on PostgreSQL. Before that, I was contributing daily to Apache Ignite, another distributed database, but one designed for high-performance in-memory computing. Before distributed systems and high-performance computing, I was on the Java engineering team at Oracle, developing the JVM and JDK, and my professional career started at Sun Microsystems, where I was advocating for various technologies, including NetBeans, Java, Solaris, and much more.

The agenda for today includes three primary topics. First, we need to understand why this whole conversation makes sense at all: we will see which problems related to geo-distribution the global cloud infra can solve. Next, we will remind ourselves of what the global cloud infrastructure looks like. Most of us understand that already, but it's often better to put everyone on the same page before we dive into the details. And after that, we will see how to use this cloud infrastructure to solve the problems from the first section while building geo-distributed apps. Alright, let's go.

So when is the global cloud infrastructure useful? The first use case is when you need to equalize latencies for users around the world. We are designers, architects, and application developers, so let's build a new application today. I am inviting you to become my co-founder. We've founded another startup company, and we are building a Slack-like messenger. Many of us use Slack daily to collaborate with our colleagues and partners, and let's say we decided to build a Slack killer. Probably we just reinvented the wheel, or maybe we have some killer feature in mind, but on a serious note, Slack is a good example of a geo-distributed app. So let's say we are building this application right now, together.

As a startup company, we have to move fast, right? So the first version of our application will run in one cloud region, probably even within a single availability zone. We will have an application instance, and we will have a database node that keeps all the messages, channels, and workspaces. Once we launched the first version of the application, we got our first three users. We certainly have more, but these three will help us work through different architectural challenges. So, welcome: Mrs. Blue, a financial advisor from New York. She has a friend, Mr. Green, a friend from her time at college, who is a scientist and now lives in Berlin. And both of them have known Mr. Red for a while; Mr. Red became a writer and moved to Mumbai, India. One day they came together and decided to use our Slack-like corporate messenger to discuss a few topics.
Now, how will Mrs. Blue, Mr. Green, and Mr. Red perceive the speed and performance of our application, based on their locations? For Mrs. Blue, who lives in New York, the latency can be as low as 5 milliseconds for the round trip. If we take Google Cloud as an example, the latency between availability zones within one region is around 5 milliseconds, and Mrs. Blue lives near our application deployment. However, for Mr. Green, who lives in Berlin, the latency of every request will be much, much higher. On average, again taking Google Cloud as an example, it takes around 100 milliseconds for a request to travel from Berlin to New York or Northern Virginia in the United States. For Mr. Red, the situation is even worse, because he is in Mumbai and his requests have to travel over land and then under the ocean.

So generally speaking, when Mrs. Blue uses our mobile application or interacts through the web interface, she will perceive our application as fast, because she lives near the primary deployment of the application instance and the database. From Mr. Green and Mr. Red, though, we might hear complaints from time to time: why is this application so slow? I cannot use it, especially when I'm trying to load the message history or post something in a channel. That's because they live far away from the primary database deployment. That's the first situation where the global cloud infrastructure can help: when the latency needs to be roughly equal for your users around the world.

What's the second use case for the global cloud infrastructure? Sometimes an application needs to comply with data regulatory requirements. Let's take our example again. The primary customer base for our geo-messenger is in the United States right now, because as a startup we put most of the marketing money into promoting it there. But I already have one customer from Europe, and that's Mr. Green. And if we know the European laws, and you certainly know them, there is the GDPR, the General Data Protection Regulation, which requires me, the application creator, to keep the personal data of Mr. Green in Europe and not in the United States. So I'm violating the law right now. In the beginning, when I'm a startup and have just a handful of users in Europe, that's probably not a big deal, because nobody is going to look into it. But once I acquire a certain number of customers in the United States and decide to expand to Europe, I have to comply with GDPR, right? And the global cloud infra can be useful here.

We've all heard about GDPR many times, and GDPR is usually the first example anyone brings up when talking about compliance. But have you ever heard that India has similar regulations related to payments? The Reserve Bank of India issued a regulation that requires you, the application owner, to keep all the payment data of Indian citizens and residents in India. So what does that mean for our geo-distributed application? Suppose at some point we decide to add a payment feature. For instance, Mr. Red wants to transfer money to Mr. Green, or Mr.
Red wants to donate money through my interface to some volunteering organization. While the processing of that payment can take place in the United States, the actual payment data has to be stored in India. And right now my application doesn't support this, not even architecturally. So that's the second problem that can be solved with the global cloud infrastructure. We have infrastructure spread around the globe, so we certainly can deal with different data regulatory requirements, and we certainly can make the application perform fast around the world regardless of the user's location.

Now, before we start looking into the specific design patterns and considerations, let's remind ourselves what the global cloud infrastructure looks like. As you know, it consists of regions. Every major cloud provider has dozens of regions, 70 or 80 plus, around the world: in Europe, in Asia, in Australia, in South America. All those regions are interconnected through the provider's network, which is pretty good and pretty fast. But the round-trip latency between regions varies, so whenever you are designing and deploying your applications for multi-region use cases, you need to decide carefully where your application instances and database nodes are going to be deployed. For instance, if we take Oregon and South Carolina, US West and US East, the round-trip latency across the country is pretty good, just around 65 milliseconds. Unless you're building a trading platform, something like 95% of applications can comfortably deploy across the country with latencies like that. But the round-trip latency between South Carolina (US East) and Frankfurt is around 97 milliseconds, and between Mumbai and South Carolina it's worse, around 270 milliseconds. So the inter-region network is fast, but we still need to pick the best regions for our solution when we deploy.

Also, every region is comprised of availability zones. You can think of availability zones as separate locations within a region. Each may or may not be a dedicated data center; usually that information is not visible to us. What we need to know is that every region has at least three availability zones, and this is done by default for reliability and availability purposes. If one of the zones goes down, your application can keep functioning across the other zones, provided you took care of this and actually deployed your application instances across all of the zones. From the networking perspective, all of the zones within a single region are interconnected through a high-performance network. If you take Google Cloud Platform as an example, the latency is under 5 milliseconds, which is great, right?

And just to round this out, every cloud vendor also offers so-called local and edge zones. Those allow you to extend the cloud infrastructure into densely populated metropolitan areas. For instance, you can find such local and edge zones in London, New York, Seattle, and so on. Usually those are for applications that require single-digit-millisecond latency: gaming, streaming, and trading are typical examples. Now let's step back for a moment, because we rushed through this global cloud infrastructure. What can we do with it? Let's try to understand.
If you have all those regions and zones available across the planet, and if all of those zones and regions are interconnected through a pretty good network, then what can we do? We can build and architect applications that provide similar performance, and therefore a similar experience, to all of your users. For our geo-distributed application, we want the speed and experience to be the same not only for Mrs. Blue from New York, but also for Mr. Green from Berlin and Mr. Red from Mumbai. I want all of them to say: wow, this geo-distributed messenger is fast. It means the people who built this application thought about me. I can enjoy it, and I will be advocating for this application.

It also allows us to comply with data residency requirements. For instance, if I know that I have regions in Europe, why don't I provision my application instance in Europe? Why don't I come up with a solution that keeps the data of European citizens in Europe, not in the United States? We can do this. We can comply with the regulations. And another wonderful benefit of the global cloud infrastructure is that you can tolerate cloud failures, including major regional outages. That's not the topic for today's conversation, but if you want to learn more about it, just ping me, send me a message, email me, and I will give you extra details on how to use the global cloud infrastructure for extreme fault tolerance.

All right. Now, let's remind ourselves: we defined two problems in the beginning. We have already created the first version of the application, which right now runs in the United States. The speed of this application is reasonable only for Mrs. Blue, who lives in the United States. For Mr. Green and Mr. Red the situation is, let's say, slow; they will talk about my application as a slow one, because they live in Europe and India. The second problem is compliance. We need to store the payment data of Mr. Red in India, not in the United States, and all the personal data of Mr. Green in Europe, not in the United States. And we also learned that we have global cloud infrastructure: regions, virtual machines, and storage around the world. So now let's see how we can put everything together and solve all of those problems. How can we build geo-distributed applications?

First, let's define what a geo-distributed application is. The definition is simple: it's an application that spans multiple geographic locations for high availability, compliance, and performance. This picture shows the extreme case, where you truly build a global application used by customers around the world. But your geo-distributed application can also be, let's say, just an application that runs across multiple availability zones within a single region; that's going to be the first version of your geo-distributed app. Or it can be an application that runs across multiple regions within Europe, which is also a geo-distributed application. The global case is the extreme one, and that's the one we are discussing today. So now let's talk about the typical architecture of a geo-distributed application. How is it different from a regular application?
By a regular application, I mean an application that we usually deploy in one region, in one availability zone, hoping that nothing will happen to that availability zone and the application will be running all the time. Similarly to regular applications, a geo-distributed app comes with a data layer, an application layer, and a load balancer, at a minimum. Certainly you can have API gateways, message buses, and so on, but today let's take the simplest architecture: data layer, application layer, and load balancer.

In the case of a geo-distributed app, the data layer has to be distributed, because my customers are distributed; they literally live around the world. My application instances also have to run across multiple locations, so that if, let's say, Mr. Red from India opens my geo-messenger on his mobile phone, the mobile application doesn't have to go to a backend service running in the United States. Why don't I have a backend instance running in India? I certainly can. And the global load balancer, as you'll see, is used to forward user requests to the application instances that are near the customers' physical locations.

Now let's apply this architecture. Remember, we had the first version of our application running in the United States, and that's where we have the database nodes. As application developers, we care about the application layer first; that's what excites us and what drives us. So if the suggestion for geo-distributed apps is to run instances around the world, then let's deploy an instance of my backend in Europe and one in India. That's what I did. Good.

Next: if I have multiple instances of my application running around the world, does it mean that my front end, my mobile app, and any API layer need to remember the IP addresses of all of those standalone instances? You certainly could do this, and it might even be your default solution, but it's cumbersome. You really don't want your mobile application or your front end to remember the locations of those instances in India, the United States, or Europe. It doesn't make sense, because those can change. A typical solution to this problem is the global load balancer. The beauty of it is that it comes with an anycast IP address. Your front end and your mobile applications use this single IP address to send user requests, and once the global load balancer receives a request, it forwards it, based on the user's location, to the nearest application instance. For instance, Mr. Green opens my geo-messenger and wants to read all the messages from one of the channels. This request from the mobile application, which is connecting from Europe, goes to the global load balancer, and the load balancer says: all right, I have an instance of the application in Europe near Mr. Green, so let that instance process this request. And this is what happens. The request no longer travels to the United States; the load balancer takes care of that.

Speed-wise, this is already a step forward, but from the data regulatory standpoint we haven't solved anything, because this application instance in Europe still needs to go to the United States for the data. For the messages, that might be fine; message content
might not be treated as personal data. But for any actual personal data, we are violating the GDPR requirements, right? And we don't want to do this, at least given our future expansion plans. Which means we also need to take care of the data layer. We need to make sure that the data, at least the personal data of Mr. Green and the payment data of Mr. Red, is stored in their countries and not in the United States. To put it graphically, for the sake of simplicity, let's just say that we somehow arranged our data layer in such a way that the required data is available from Europe and from India, so we don't need to go to the United States all the time.

But with the data layer, there is no silver-bullet solution. With application instances, it's all clear, right? We can deploy those instances anywhere, we can use Terraform or other tools to simplify and automate the deployment, and we can use the global load balancer. But with databases, there is no one-for-all solution, so we need to review the options separately. When I talk about multi-region database deployments, I define four different types, and you will pick one or several of them based on your use case. The first one is a single multi-region database cluster that, let's say, runs across Europe. Then you can have a single multi-region cluster with read replicas, which is good if you want to accelerate reads in distant locations. When you need to make sure that performance for both reads and writes is equal around the globe, or when you need to keep the data in the proper locations for data regulatory requirements, you can use multiple standalone database clusters; you will recognize this type, since it's highly likely you have used or come across it before. Or you can use a geo-partitioned database cluster. That one is a sort of dark horse for us: something new, supported by a new type of databases, and we will review it at the end. Also at the end, I will show you a quick demo of one of these deployment types.

Now, two quick slides about YugabyteDB, for two reasons. First, I work for Yugabyte and I build geo-distributed applications using it. Second, all of the deployment types that we are going to review, I am 100% certain are possible with YugabyteDB. When it comes to other databases, I cannot speak for them; it's up to you to review and decide. So, a quick introduction to YugabyteDB. YugabyteDB is an open source distributed SQL database that you would usually use for OLTP workloads. It's built on the PostgreSQL source code and inspired by Google Spanner. At its core, it has the query layer, or compute layer, which is built on PostgreSQL; it's literally the PostgreSQL source code. Below that compute layer is our storage layer. It's distributed: we support sharding, so your data is going to be sharded across as many regions as you like and load-balanced. It's fully consistent, with support for transactions, and it uses the Raft consensus protocol for data replication.

When it comes to the level of compatibility with PostgreSQL: everything you see here in green is PostgreSQL source code. That's the query engine, the compute engine, of YugabyteDB; everything in green is taken straight from the PostgreSQL repository, unchanged. The red sections are our optimizations. We had to optimize the planner, the executor, and some other components to make sure they generate the most efficient plans and execute them over our distributed storage engine, which is unique to YugabyteDB.
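In practice, that compatibility means ordinary PostgreSQL SQL runs against YugabyteDB unchanged while the storage layer shards and replicates it underneath. Here is a minimal sketch, assuming a hypothetical workspaces table for our geo-messenger rather than anything shown in the talk:

```sql
-- Plain PostgreSQL DDL/DML; YugabyteDB's YSQL layer accepts it as-is,
-- and the storage layer splits the table into tablets behind the scenes.
CREATE TABLE workspaces (
    id   bigserial PRIMARY KEY,
    name text NOT NULL
);

INSERT INTO workspaces (name) VALUES ('geo-messenger');

SELECT id, name FROM workspaces;
```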
Now, let's review the database deployment options one by one. The first one is a single multi-region database cluster. Imagine that I have several regions, US West, US Central, and US East, and I stretch my cluster across those regions. The latency is fine for my geo-messenger application: around 65 milliseconds between East and West, and less between West and Central. Usually you would use this type of deployment for the sake of fault tolerance: if one of the regions goes down, my geo-messenger remains operational, because the other two regions keep running and all of the data is there in a consistent state. What's also important about this type of deployment: every node is equal, meaning there is no notion of leader and follower nodes. All of them keep a subset of the data, and all of them serve both reads and writes.

The other use case, and this is probably what I would do anyway: you see that in the United States I stood up a multi-node cluster with the nodes running across several zones. I did this for high availability, because regions do go down, sometimes for many hours, and I don't want my messenger to be unavailable while a region is down. But now I'm expanding to Europe. Eventually we met and said: the time has come to go to Europe. So let's at least accelerate reads for Mr. Green, and the same in Asia for Mr. Red. Most of the time, people open the messenger and read, read, read the data; only occasionally do they start chatting and sending messages. Most of the time you read. And this is easy to implement with a geo-distributed database: you just attach read replica nodes to your geo-distributed database cluster. After you do this, the read latency for Mr. Green, Mr. Red, and Mrs. Blue will be low, probably under 5 milliseconds, at least on Google Cloud.

But now suppose I want to accelerate both reads and writes, because at some point, say in a couple of years, when our geo-messenger has become a hit, I want the speed of writes to be good as well. And finally, I also have to comply with the data residency requirements: I want all the personal data of Mr. Green that is stored in my messenger to be located on the nodes in Europe, not in Asia or the United States. Here is one of the solutions: you can deploy separate clusters in different geographic locations. You have a dedicated standalone cluster in the United States that keeps all of the data of those who live in the United States, and you have similar clusters in Europe and Asia; I use a different color scheme here to highlight that these are separate clusters. For many cases and many apps, that would be a perfectly reasonable architecture. But for my geo-messenger, it's not a good solution.
My application layer, the backend and the mobile application, would need to maintain connection endpoints to all of those separate databases. It would also be hard to join data across those databases, and especially to run transactions across them when necessary. Which means I want something simpler: I want a single database connection endpoint, and when needed, I want Mr. Green to be able to pull data from the United States, or Mr. Red to store some data in the United States if it doesn't violate the regulations of the Reserve Bank of India. Why not? That's a reasonable requirement.

And here you can take advantage of a geo-partitioned database cluster. In this picture, we have one single database cluster that runs around the globe, and it's just one connection endpoint for your application. You certainly can connect to every node, but usually you put a load balancer in front. If your application wants to send a request that joins data stored in the United States and in Asia, you can do this through one connection. So how does this geo-partitioned cluster work? How is the data arranged? You, as the application developer, need to introduce a special column into your data model; in my case, the column is named region. Based on the value of this column, the database will automatically place the data in one of the locations. For instance, when Mrs. Blue sends any request, all her messages are labeled with region equal to the United States, and the database puts that data on the nodes in the United States. When Mr. Green sends any messages or any data, it is all labeled with EU, the European Union, in the region column, and the database likewise puts it in the proper location. And when you request the data, you also specify the region, so your backend goes only to those nodes that keep the data for, say, the European Union.
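To make that concrete, here is a rough sketch of what row-level geo-partitioning looks like in YugabyteDB, following its documented tablespaces-plus-list-partitioning approach. The table, region values, and cloud placement details are illustrative assumptions on my part, not the actual geo-messenger schema:

```sql
-- A tablespace that pins replicas to zones in a European region
-- (cloud/region/zone values are illustrative):
CREATE TABLESPACE eu_tablespace WITH (
  replica_placement='{"num_replicas": 3, "placement_blocks": [
    {"cloud":"gcp","region":"europe-west3","zone":"europe-west3-a","min_num_replicas":1},
    {"cloud":"gcp","region":"europe-west3","zone":"europe-west3-b","min_num_replicas":1},
    {"cloud":"gcp","region":"europe-west3","zone":"europe-west3-c","min_num_replicas":1}]}'
);

-- The "special column" from the talk: rows are routed by the region value.
CREATE TABLE messages (
    id         bigserial,
    channel_id bigint,
    message    text,
    region     text,
    PRIMARY KEY (id, region)  -- the partition key must be part of the key
) PARTITION BY LIST (region);

-- Mr. Green's rows land on the European nodes...
CREATE TABLE messages_eu PARTITION OF messages
    FOR VALUES IN ('EU') TABLESPACE eu_tablespace;

-- ...while other regions get partitions pinned to their own tablespaces,
-- defined the same way (omitted here for brevity).
CREATE TABLE messages_us PARTITION OF messages FOR VALUES IN ('US');
```

Reads then include the region, along the lines of `SELECT * FROM messages WHERE region = 'EU' AND channel_id = 10;`, so the query is served entirely from the nearby partition.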
Now let's take a quick break from the slides, because we have about 10 minutes left and I want to show you one of these deployment types, probably the simplest one, with read replicas. So what do I have? I have a database cluster running in YugabyteDB Managed; I don't want to manage the database myself, so I usually deploy it there, in the cloud. My primary cluster is in US East, with three nodes, and every node runs in a dedicated availability zone. And I have a read replica in Asia, in Taiwan to be precise. What I'm going to do now is connect to that cluster: I'll open a connection from a virtual machine in Asia, and from that machine I will connect both to the primary cluster and to the read replica node, and I will show you the difference in latency, because that's the first problem you can solve with geo-distributed apps.

I'm connecting to that virtual machine in Asia from this terminal window, and I'm also opening another connection to the same machine from another window. Why do I need several? Because from the first connection I will be interacting with the database node in the United States directly, while from the second one I will connect to the read replica node that is deployed in Asia. So all the requests, at least the reads, should be faster from this location; that is my expectation. Let's do this. From here, I'm connecting to the US node using a psql connection. All right, the connection is open. And here, I'm connecting to the read replica node that is in the same region as the VM.

Now, for the read replica connection, I need to set a couple of extra parameters (you'll see a sketch of the exact settings in a moment). The first one lets me execute read-only transactions through this psql connection, and the second one allows reads from followers. The read replica node keeps a consistent copy of the data, but that copy is not the latest, because the primary cluster that I'm connected to from the first session ships all of its updates to the read replica asynchronously. The updates arrive, get merged and applied, so the data on the read replica is consistent, but still not the latest, and this second parameter basically says that I'm okay reading such data from the read replica node.

Next, let's switch to the messenger schema in both sessions. If you check the structure, we have multiple tables, such as messages, channels, and workspaces; everything you would usually find in Slack. Now let's run this query in both sessions. Surprisingly, the query to the node in the United States executed faster than the one through the read replica node. It took more time through the read replica because, during the first query against a specific table, a replica needs to load metadata from the primary nodes in the United States. But the good news is that you pay this cost only once. Let's enable the timing function so we can see the latencies. When I execute this query against the US node, the latency is around 200 milliseconds; that's expected, since the request travels to the United States. But when I execute the same request through the read replica (I need to enable timing here as well), it's around 9 milliseconds versus 190 milliseconds, so 10 to 15 times faster, just because I'm reading the data from the read replica. As for that first slow request, don't worry about it too much: your application usually has a connection pool, and the connections in the pool are already warmed up; you can warm them up at startup. Also, the YugabyteDB team keeps shipping optimizations, so in a few releases you will likely see that even the first request through the read replica is fast.
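For reference, here is roughly what that read replica session setup looks like in psql. This is a sketch based on YugabyteDB's documented follower-read parameters; the optional staleness bound is an assumption on my part, so check the docs for the current default:

```sql
-- psql session settings for the read replica connection:
\timing on                                             -- show per-query latency
SET session characteristics AS transaction READ ONLY;  -- follower reads require read-only transactions
SET yb_read_from_followers = true;                     -- allow reading the replica's (slightly stale) copy
-- Optionally, bound how stale a follower read may be, e.g.:
-- SET yb_follower_read_staleness_ms = 30000;
```

With those set, the same SELECT that takes around 190 milliseconds against the US node comes back in single-digit milliseconds from the replica.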
Next, what's also interesting about read replica nodes in YugabyteDB is that you can do writes through them. Your application deployed in Asia doesn't need to maintain a connection endpoint to a node in the United States for writes; you can send writes to the read replica, and the read replica will forward them to the primary cluster in the United States. To do that, we need to allow read-write transactions through the read replica connection for the next request. Now, when I run the insert, the first insert through the read replica node is slow, because I'm using a sequence for the message IDs, and on the first call the session requests a range of, say, the next 1,000 IDs from that sequence, which also lives in the United States. That's why the very first insert took more time.

But when you insert the next few rows, let's insert the next one, it takes somewhere around 500 to 700 milliseconds, because now I have the pre-allocated IDs here, and my read replica node simply forwards the writes to the primary cluster, which is fine, and then receives the changes back. Now let's switch back: I no longer need to write, so I'm setting the session to read-only again. And when you read all this data back, you see that it's already in my read replica cluster; it takes just a few milliseconds, not hundreds. Yes, those are the records I've just created: they were written to the primary cluster, and the primary cluster replicated and synchronized them to my read replica.

Wonderful. So this read replica deployment is for those of you who need to boost the speed of reads in distant locations; take advantage of it. Writes, as you remember, are slower through the read replica, so if you want to accelerate writes, you know what to do, right? You deploy a geo-partitioned cluster for your application, or you deploy multiple standalone clusters. That's what I usually do. That's enough of the demo; let's quickly wrap up the presentation. You will get the slides.

So, what type of database deployment do you need? Use this chart. If you need to take care of compliance, the data regulatory requirements, then ask yourself: do you need a single database endpoint for your application? If yes, then that's going to be a geo-partitioned cluster. If no, if you don't need to easily and efficiently query across databases, then you can deploy multiple standalone databases. If you don't care about compliance, then ask yourself: do you need to boost only reads, or both reads and writes? If you need to boost both reads and writes, then you deploy either a geo-partitioned cluster or multiple standalone clusters. If all you need is reads, then deploy a single multi-region cluster in one of the locations, such as Europe, and attach read replicas in the distant locations.

Good. There is another important takeaway from this presentation. We were looking at the extreme use case, where my application instances and database nodes are deployed across the world. But the first version of your application doesn't need to run across the globe. You can architect it with the vision that at some point it will span multiple regions across multiple continents, but it's okay to run the first version in a single region across multiple availability zones. With that deployment, you will be able to withstand and tolerate zone-level outages, and you will already have implemented your application in a geo-distributed way, so that when the time comes to expand your application instances and database nodes to other regions or continents, you will be able to do it easily, without any re-architecting cycles. If you want to learn more, there are two books that I usually recommend on this topic: Architecting for Scale, from O'Reilly, and Designing Data-Intensive Applications.
If you'd like to learn more about the distributed data layer and how to use it for these types of applications, probably the best places to start are the Yugabyte Developer Hub and Yugabyte University. So, thanks for coming, and we have a couple of minutes left for questions.