So before we start the story, we really want to look at why we decided to develop this database. Necessity is the mother of all invention.

We were looking at explosive data growth, driven by the likes of e-commerce and social media, and the prediction is that in the next two years we're going to reach about 132 zettabytes of data globally. If you're running an e-commerce website like Taobao, Lazada, Shopify, or Shopee, one of the challenges is the massive data volume, and it comes from many different sources: product data, metadata, customer feedback, and of course the transactions you have to handle. Nowadays, platforms like Lazada and Shopee run promotional offers every month or so, and the most popular one is Double 11 (W11). In fact, that's the origin of this database: it was a Double 11, about 12 years ago, when one of the commercial databases we were using keeled over and caused a lot of pain, and the company finally said, enough is enough, we need to find another way.

The other driver was the need for stronger capability to handle that kind of data growth. Traditionally we had what we call the monolithic database, which is simply how databases were developed: the concept started on mainframes, then moved to mid-range systems like the AS/400, and so on.
Then we moved into the PC era, with commodity hardware running Linux. Linux really was the start of the revolution, and that's why I'm very happy to be at an open source event; when I started my IT journey, I actually started with Red Hat Linux version 1. At that time the database model was simply: I want somewhere to store data. And there wasn't much data to store, because the computers weren't that capable; memory and storage were tiny. We're talking 40 or 80 megabyte hard disks, as I remember. That was a really long time ago.

From that starting point, the traditional solution is to buy a bigger, more powerful system, but eventually you hit a roadblock where bigger is not always better. Another solution comes from Oracle, where I also used to work. Oracle says: I can make it scale. But in some sense it's still monolithic, because only the compute is separated. Oracle Real Application Clusters gives you a cluster of machines, but your storage is still shared, your session information is still shared, your memory state is still shared, so you still have limitations. Of course there are machines like Exadata, which are very capable and can scale to handle petabytes of data, but that's a really expensive, customized hardware solution.

And then you have sharding. A very good example of sharding is DynamoDB from AWS; I used to work for them.
With that kind of sharding, you take a hash key and use an algorithm to spread your data across a number of systems, and you can split and partition as much as you want. The complication is that you have to manage all of that yourself.

So from that perspective, we see this as a distributed requirement: eventually you need to distribute your data and expand how you store it. You still do partitioning and sharding, but you want to automate it, and you want the ability to read and write on more than one particular machine. Some systems in the market can scale horizontally; take AWS Aurora, for example. It has a shared storage layer, but there's only one read-write instance: one massive writer, plus many readers you can use to scale. So for read-only workloads you can scale quite far, but for the transactions you're writing into the database, you still have a single instance. Then there are systems like Google Spanner, with a very similar concept, where data is spread across multiple systems automatically, and even across regions.

So the pain point for us was a performance bottleneck. We really wanted to scale, and to scale very fast on commodity hardware, which is why the database needed to be distributed. And then there's the perennial chicken-and-egg issue with customers: they don't only want to do transactional work.
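The hash-key sharding described here can be sketched in a few lines. This is an illustrative sketch of the general technique, not DynamoDB's actual partitioning scheme; the node names, key format, and modulo placement are invented for the example, and the last few lines show the manual-rebalancing burden just mentioned.

```python
# Illustrative hash-key sharding sketch. Node names and key format are
# made up for the example; real systems use more elaborate partitioning.
import hashlib

NODES = ["node-a", "node-b", "node-c"]

def shard_for(key: str, nodes=NODES) -> str:
    """Hash the partition key and map it onto one of the nodes."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

# Every row with the same key deterministically lands on the same node.
assert shard_for("order:1001") == shard_for("order:1001")

# But resharding is the operator's problem: adding a node relocates most
# keys, which is exactly the manual management burden described above.
moved = sum(
    shard_for(f"order:{i}") != shard_for(f"order:{i}", NODES + ["node-d"])
    for i in range(1000)
)
print(f"{moved} of 1000 keys relocate when a 4th node is added")
```

Schemes like consistent hashing reduce how many keys move, but someone still has to operate the resharding, which is the point being made here.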
Customers say they want transactional workloads, and then suddenly they also want analytics, so at the moment you do it in two places. For us, the goal is that in the same instance, on the same schema and data tables, you can do both. I have to go faster now.

Very quickly, our evolution: we started about 12 years ago with Taobao, where we decided to build this self-developed distributed architecture. It initially supported Taobao, and eventually Alipay as well. So if you use Alipay or Taobao today, the database backend is actually OceanBase. Looking further forward, our financial partners said: this is great, you can do massive distributed financial transactions, and we want to use it in our own business too. So banks, brokers, and even insurance companies started to use us, and as we moved forward, more and more commercial partners. Overseas, we now have payment partners such as GCash and Dana using our database platform.

Very quickly, how it works: it's a distributed architecture. When you deploy an OceanBase database (for the community edition we can now do it in one instance), you basically have a cluster of three nodes. We can separate the cluster into different tenants, and each tenant is an individual database. For each tenant, you decide which replica will be the leader. The leader drives the whole cluster for reads and writes: when you write to the database, when you transact, you always hit the leader. But if you have eventual-consistency rather than strong-consistency requirements,
you can always ask the system to direct reads to the followers, and you can decide how you want to run it.

Another really unique thing for us is real-time, adaptive compression. As you do transactions and store data, the database looks at what kind of data each column holds and applies a different kind of encoding or compression. It's very common for my customers to find that a 10 terabyte MySQL database comes out at only three terabytes on OceanBase, because it's fully compressed.

One of the reasons we decided to go open source is that this is really where the world is heading. More and more open source databases are getting adoption, and open source has proven to be a model where the more the community uses a database, the more stable and capable that platform becomes. The best examples for me on the open source side are Postgres and MySQL: Postgres is on version 15, 16 now, and MySQL is on 8.0, because the community is pretty strong, usage keeps increasing, and the community keeps pointing out where to improve the database. We want that capability as well.

We are at the start of our journey as part of the open source community; we started the project in 2021. What we want to do is adopt the best practices of an open source foundation. Taking elements from, for example, the Apache Foundation, we're going to have a technical oversight committee, a development group, and a user group. At the moment there's a really strong open source community for OceanBase in China, and we're also expanding overseas.
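The type-aware compression idea can be illustrated with two classic column encodings, run-length encoding for repetitive values and dictionary encoding for low-cardinality strings. To be clear, this is a toy sketch of the general technique, not OceanBase's actual codec, and the column contents are invented.

```python
# Toy sketch of type/shape-aware column encoding (NOT OceanBase's codec):
# pick run-length encoding for long runs, dictionary encoding for strings.

def run_length_encode(values):
    """Collapse runs of repeated values into [value, count] pairs."""
    out = []
    for v in values:
        if out and out[-1][0] == v:
            out[-1][1] += 1
        else:
            out.append([v, 1])
    return out

def dictionary_encode(values):
    """Replace repeated strings with small integer codes plus a dictionary."""
    mapping, codes = {}, []
    for v in values:
        codes.append(mapping.setdefault(v, len(mapping)))
    return mapping, codes

# A status column with long runs compresses dramatically under RLE:
status_column = ["PAID"] * 800 + ["REFUNDED"] * 10 + ["PAID"] * 190
rle = run_length_encode(status_column)
print(f"1000 values -> {len(rle)} run-length pairs")  # 3 pairs

# A low-cardinality string column shrinks to small integer codes:
country_column = ["SG", "MY", "SG", "ID", "SG", "MY"]
dictionary, codes = dictionary_encode(country_column)
print(dictionary, codes)
```

Real columnar engines choose among many more encodings per column based on observed statistics, but the principle is the same: look at the data, then pick the cheapest representation.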
Here is where we are as an open source platform. We started with 3.1, where we open-sourced 3 million lines of code, put it on GitHub, and said: knock yourselves out. The source code is here; you can compile it, run it, play with it. Then, slowly, we are opening up the ecosystem surrounding the database as well.

To highlight some of it: for data migration we have OceanBase Migration Services, and there's also the OceanBase management platform. That's kind of unique; I've never seen another open source database platform ship with its management platform open-sourced too. There's also the IDE, and a few other tools we're open-sourcing as well. Most importantly, we're integrating with the open source ecosystem: we're contributing code to some of the open source ETL tools, and for an IDE you can also use DBeaver. If you download the latest version of DBeaver, you'll find OceanBase in there. So that's the ecosystem. In the later releases of version 3, we also looked at things like slow-SQL and top-SQL monitoring; basically, the ease of use of the open source platform.

Recently we launched version 4.1, which is really exciting. I said earlier that you initially needed three different nodes in one cluster, and our open source users told us it's a bit difficult to find that many resources just for initial testing and development. So we re-architected the platform so that it can run on a single compute instance. It's still a distributed architecture; it retains our design, but we are able to run it on a single piece of hardware now.
Another thing we want to do next is take on analytical performance requirements as well. There are platforms such as Redshift that are really strong on columnar analytics, and we're going to bring that into our platform too. And most importantly, we keep improving our capability to automatically distribute the transactions and query load across the nodes, so that you don't have to do it yourself.

In essence, what we want is standardization: we want to be the MySQL of the distributed database. There's no such thing at the moment; there are still quite a lot of platforms on the distributed relational database side, so I think we're kind of unique that way. And along the way we want to scale our user base as well. Of course, to get adoption, we are MySQL compatible: fully compatible with 5.7, with more and more compatibility for 8.0, and by the end of the year we expect full compatibility. That is our GitHub. Most importantly, this is not a new database platform; it has been in production for quite a while now.

To get started with the OceanBase community edition 4.1, here is the documentation. You can come in and take a look at what OceanBase is about, what we are doing in the open source space, and our community edition. I apologize that at the moment it's still mostly in Chinese, but if you go here you'll find the community edition, and you can download the x86 build. The ARM build is currently specific to the ARM chips used in China,
but I'm going to ask the team to also validate on ARM in the international market, for example AWS Graviton. We want to validate that. I'm going fast; I think I'm just okay. Great. So thanks for that. I'm going to go back here so that you understand the ease of use, and take any questions so far.

What consistency guarantees does it provide?

By default, it's eventual consistency for reads. There's a leader and two followers, and we use the Paxos algorithm to make sure the data is highly available and reliably replicated. There are two modes. One is eventual: you can read from the followers. But if you want strong consistency, you can define that: either you go directly to the leader, which of course gives you strong consistency, or you delay the read on the replica a little to make sure it is fully consistent.

I have a related question. Suppose I want to use this for financial transactions and I need strong consistency: how consistent are the backups? If I take a backup of the whole distributed cluster, is there a possibility that different parts of the cluster are effectively at different points in time when they get backed up, or do I get a completely guaranteed stable snapshot of all nodes?

It's going to be a completely stable snapshot, because we only take the backup from a completely stable and consistent snapshot of the data.

More questions? Yes, question, please.

Hello. I just want to ask, how does it compare to something like a CRM, a customer relationship system?
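The leader-plus-two-followers replication described in this answer can be sketched as a majority-quorum commit. This is a deliberately simplified model of the Paxos-style idea, not OceanBase's actual protocol; node behavior is simulated in-process, and the names are invented.

```python
# Simplified majority-commit sketch (Paxos-style idea, not the real protocol):
# a write is acknowledged once the leader plus a majority of the full
# cluster have persisted it, so one failed follower doesn't block commits.

class Node:
    def __init__(self, name, alive=True):
        self.name, self.alive, self.log = name, alive, []

    def append(self, entry):
        if self.alive:
            self.log.append(entry)
            return True          # ack
        return False             # no ack from a down node

def replicate(leader, followers, entry):
    """Commit iff a majority of the whole cluster persists the entry."""
    acks = int(leader.append(entry))
    acks += sum(f.append(entry) for f in followers)
    majority = (1 + len(followers)) // 2 + 1
    return acks >= majority

leader = Node("leader")
followers = [Node("f1"), Node("f2", alive=False)]  # one follower down

# 2 of 3 nodes ack, so the write still commits:
print(replicate(leader, followers, "tx-42"))  # True

# A follower may lag behind (eventual consistency for follower reads);
# a strongly consistent read goes to the leader, which has every commit.
print("tx-42" in leader.log, "tx-42" in followers[1].log)  # True False
```

This also shows why follower reads are only eventually consistent: the dead follower's log is missing the committed entry until it catches up.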
We can definitely power a CRM. Are you asking how easy it is to run a CRM or similar software on it? Very easy, because if you can run it on MySQL, you can run it on OceanBase.

So what's the selling point?

The selling point is distributed transactions, financial transactions. It's not very common for a relational database to be distributed and scalable and to handle financial transactions. For financial transactions people commonly use Postgres, because it's pretty good, or Oracle, or SQL Server; but those are monolithic databases, they are not designed to be distributed. Next question? Yes, you can follow me, or download it and give it a try. Oh, there's one more question.

I think I read somewhere that you have OceanBase and then PolarDB, also from Alibaba, and I think there's also an in-memory one, I forgot the name. What are the differences between those?

First off, we are part of Ant Group; we are kind of a sister company to Alibaba. Alibaba itself also has its own database technology, just as AWS has its own managed database services. On the commercial side, we sell our services on Alibaba Cloud, and we also sell on AWS, so we're multi-cloud. The difference is that all those other services, whether it's PolarDB or Aurora, are always specific to a particular cloud: you can only use them on that cloud, you can't really use them anywhere else. Whereas we are kind of different:
we want to be a standard for distributed databases, so you can use us anywhere you want. We're more like co-opetitors, if you will; not really competitors, but at the same time we are, because we compete in the same space.

Okay, I have one more question, then I'll hand it over. Let's talk about rebalancing data. With some distributed systems, placement tends to be a little static: if you're writing a lot of data, it can end up going mostly to one node for one reason or another. How are you able to rebalance things if one node becomes a lot more full than another? Could you explain that a little?

It depends on how you want the database to behave. If you let the database do automatic partitioning and balancing, it constantly looks at the different nodes and asks: am I hitting one of these nodes really hard? But sometimes you get the hot-table problem, where one particular node, because of account transactions or a region (say Orchard in Singapore suddenly becomes really hot), gets really heavy usage: the CPU and I/O all go up. So you can manually say: for that node, split the data, and it will be done automatically by the system. Or the system itself can monitor it, detect the hot-table issue, and split the data. So that can be done by the database.
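The split-the-hot-partition behavior described in this answer can be sketched as follows. This is an invented illustration of the general technique, not OceanBase's balancer; the shard layout, load numbers, and threshold are all made up for the example.

```python
# Illustrative hot-shard splitting sketch (NOT OceanBase's balancer):
# watch per-shard load and split the hottest shard's key range so the
# two halves land on different nodes. All numbers here are invented.

shards = {                       # key range -> [node, requests/sec]
    ("a", "m"): ["node-1", 9500],
    ("m", "z"): ["node-2", 300],
}
HOT_THRESHOLD = 5000

def split_hot_shards(shards, spare_node):
    for key_range, (node, load) in list(shards.items()):
        if load >= HOT_THRESHOLD:
            lo, hi = key_range
            mid = chr((ord(lo) + ord(hi)) // 2)   # midpoint of the range
            del shards[key_range]
            # Each half starts with roughly half the observed load; one
            # half moves to a less loaded node to spread the traffic.
            shards[(lo, mid)] = [node, load // 2]
            shards[(mid, hi)] = [spare_node, load // 2]
    return shards

split_hot_shards(shards, "node-3")
for key_range, (node, load) in sorted(shards.items()):
    print(key_range, node, load)
```

A real balancer would use richer signals (CPU, I/O, row counts, skew within the range) and move data incrementally, but the monitor-then-split loop is the core idea.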
That's why we don't really have that problem. Even if you use sharding with something like DynamoDB, there is exactly this problem: if a table gets really hot, that partition dies, which is a common problem.

The next question is about edge computing: have I seen any edge-computing use cases in my experience? Actually, most of my customers want it in-country. If you go to Malaysia or Indonesia, they don't even want to talk about edge computing; the requirement is that you are there, either in a data center or in a public cloud inside the country. And if a customer says, I need to deploy on edge services (and I'm talking to edge-computing service providers as well), you can deploy the database at the edge; it's not an issue at all.

Okay, do we have more? I'm sorry if it sounds like two questions squeezed into one, but first, may I ask about the requirements for latencies between different availability zones? And do you support multi-region installations? By region I mean, say, one is Singapore, one is Japan. Exactly: Asia, Europe, etc.,
different continents. Personally, I think that's an impossible task across continents. But we do deploy across different places like Beijing, Shanghai, and Guangzhou, which are pretty far apart, provided everything is on fiber and the latency is under 50 ms; that's the requirement, latency below 50 milliseconds. And I know where you're going with that: a lot of these distributed systems rely on clock synchronization, and in fact our clocks must be fully synchronized. Even Spanner needs full synchronization, and Google has one of the most accurate clocks on earth to do it. We have the same requirement, because we are a distributed system.

All right, we have time for one more question. I don't know if you mentioned the licensing before, because it says Mulan 2.0, and there's a bit of a Chinese version of it. First off, great question; I was expecting that one. I asked the same question myself, because I only started with them three months ago. The Mulan license is very similar to GPL version 3: it's an open license, you can use the code, but your own code needs to be open as well, as simple as that. And you can include the Mulan license in your own licensing, which is kind of like what Spark is doing: if you look at Spark's licensing, it says "I'm licensed under Apache License version 2", and then lists all the other open source licenses it includes as well. Mulan can be included that way too, so no issues.

Okay, thank you very, very much for this wonderful talk.