Hello everyone. Good morning, good afternoon, or good evening, depending on where you are joining from. Today I'm going to talk to you about the changing landscape of open source databases, and I will look at it from two angles. One is the changes in open source itself: the definition of open source, the open source business, and so on. The other is the changes taking place in database technology. In this presentation I will use "free software" and "open source software" interchangeably. I know they are not quite the same thing, and some people are very passionate about using one term or the other. I'm just going to mix and match, so forgive me for that.
Now let's start with some history. Where is open source software coming from? In the early days, hardware and software were bundled together, and source code was shipped with that early software. So the early adopters could modify the code to fix bugs and add functionality as they needed. Those changes were also openly shared, following the academic principle of sharing knowledge. It wasn't called open source at the time, or even free software; those terms came about much later. But in essence, it was something like that.
Then, in the 70s, a number of things happened which created the proprietary software industry as we know it today. First, computer software became a copyrightable item, at least in the United States. It wasn't on the list before that, because software code wasn't a thing. Also, IBM was forced to unbundle its software and hardware in response to antitrust action. So software became a copyrightable item which could be sold separately. It really became a major class of intellectual property at that time, and it remains one today. And there has been a lot of innovation happening in proprietary software.
The next era I want to touch on is something I would call the era of romantic open source and free software, in the 80s and 90s. On the picture here you can see Richard Stallman, the early leader of the Free Software Foundation and a champion of free software. A lot of the thinking at that time was that open source or free software is good for you. It's good for the planet. It was the right way of doing things. It was not so much about making money, or even about helping corporations make money.
In the 2000s, though, open source gets a lot more mainstream adoption. You can recognize that in folks like Steve Ballmer from Microsoft calling Linux a cancer. Obviously Microsoft has now moved a long way from that position and is one of the leading companies which works with open source and contributes to open source these days. There were also successful exits, with MySQL being acquired by Sun for a billion dollars, which really sounded like a lot of money at that time. Before this acquisition, Red Hat was really the exception, the one company which was able to IPO and maintain its value over the long term. This is also when enterprises recognized the value of open source, and it became the preferred choice for many.
So why is open source important for enterprises? Well, it offers lower direct cost in many cases, of course. And many engineers, especially the good ones, prefer it to commercial software. If you're a gifted engineer, you're likely to prefer open source software because you can understand it better when you need to. You have the source code available to figure out how it works inside, and why whatever you're doing with that software works or doesn't.
Better productivity, faster innovation, and, what is also important, you avoid software vendor lock-in. Lock-in can be problematic over the long term, because as software becomes more complicated, changing something foundational such as a database or an operating system can become very expensive, and that creates an unbalanced relationship between the vendor and the customer.
What also happened at this point is that a new generation of open source companies was born, which really focused on the business first: hey, it looks like a great idea, we can have an open source solution for X and make a lot of money along the way. The fastest way to do that in many cases is to raise venture capital, and a lot of open source these days is venture funded. What that means, of course, is that you need to provide high returns to your financial backers, and do it quite fast. That creates, I think, a very interesting dynamic. Companies use the attractive messaging of open source to get more attention, because that's preferred by so many companies, but at the same time they focus on building a monopoly: avoiding commoditization, increasing stickiness, building anti-competitive moats, all those things they teach you at business school as ways to build a successful business. In many cases, that comes into conflict with the classical, romantic open source values.
With that, something I will call "not quite open source" got a lot of traction. Indeed, the majority of software which is marketed as, or casually referred to as, open source falls into one of these buckets: open core, open source eventually, and so on. Let's look at MySQL, for example. MySQL is often spoken about as the world's most popular open source database. But in fact MySQL is open core. There is MySQL Community Edition, which is open source.
It is, however, missing certain features which are reserved for the Enterprise Edition, which is commercially licensed. That means you cannot get all the features you may need in the open source solution, only in the proprietary one.
The other approaches include open source eventually, where software is released under a proprietary license and then goes open source after a certain number of years. You have shared source licenses: for example, MongoDB uses the SSPL, which restricts competition from cloud vendors. And there is also an interesting class of software which is open source compatible software. I would like to focus on that a little bit more.
One thing I like about open source compatible software is that it typically doesn't claim to be open source. It's honest proprietary software which claims some level of compatibility with open source software. The important thing, though, is that you understand what this compatibility typically means. I like to call it Hotel California compatibility: very easy to check in, but not so easy to check out. Open source compatible technologies are often very much focused on making it easy to migrate from open source technologies to them. They then provide additional features, which you will likely adopt, and which make it very hard to move back to the open source software if you ever need to.
Keep this in mind. If you're using open source compatible software, and ensuring that you are not locked in is important to you, make sure you test your software with the open source software in question. For example, in the database space, if you're running Amazon Aurora and want to ensure your software can still run on plain MySQL, make sure to test that, not just hope for it.
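One way to act on that advice is to run the statements your application issues against both systems and compare the outcomes. The following is only a minimal sketch of the idea, not a real migration tool. It uses two in-memory SQLite databases as stand-ins for the compatible service and the plain open source database, and the `vendor_only` function is a hypothetical name that simulates a feature existing on only one side.

```python
import sqlite3

def run(conn, sql):
    """Run one statement and return ("ok", rows) or ("error", message).

    conn.execute() is a sqlite3 shortcut; with another driver you would
    go through a cursor instead.
    """
    try:
        return ("ok", conn.execute(sql).fetchall())
    except sqlite3.Error as exc:
        return ("error", str(exc))

def find_incompatibilities(primary, reference, statements):
    """Return the statements whose outcome differs between the two backends."""
    problems = []
    for sql in statements:
        primary_result = run(primary, sql)
        reference_result = run(reference, sql)
        if primary_result != reference_result:
            problems.append((sql, primary_result, reference_result))
    return problems

def make_db(with_vendor_extension):
    """Build a small test database; optionally add a vendor-only function."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])
    if with_vendor_extension:
        # Hypothetical extension that only the "compatible" service offers.
        conn.create_function("vendor_only", 1, lambda x: x * 2)
    return conn

primary = make_db(with_vendor_extension=True)     # stand-in for the compatible service
reference = make_db(with_vendor_extension=False)  # stand-in for the open source database

statements = [
    "SELECT id, total FROM orders ORDER BY id",
    "SELECT vendor_only(total) FROM orders ORDER BY id",
]

problems = find_incompatibilities(primary, reference, statements)
for sql, primary_result, reference_result in problems:
    print(sql, "->", primary_result[0], "vs", reference_result[0])
```

In practice the statement list would come from your application's own test suite or query log, and the two connections would point at real instances of each system.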
Now, if you think about this not-quite-open-source software, it has an interesting impact on the open source movement in general. On one hand, it allows much more investment and a higher pace of innovation in the open source space than a bunch of romantic folks would be able to provide. But at the same time, this software does not provide all the value of fully open source software, and, even worse, in some cases it may mislead people and erode the understanding of what open source software is.
Now, moving closer to the present, in the 2010s we have the rise of the cloud, and the cloud obviously brings a lot of unique challenges and opportunities to open source software. One very interesting effect is this: before the cloud, many companies would rely on the GPL and dual licensing to prevent others from building a commercial derivative of their work and monetizing it without giving anything back. For example, MySQL had a very successful dual license business. If the GPL doesn't work for you because you want to build a commercial derivative, you can do that, but you have to buy MySQL under a different license.
With the cloud, you do not actually distribute software, so you can run a modified version of GPL software without paying anything to anyone. For example, Amazon Aurora and a few other cloud vendors run derivatives of MySQL, and they make a lot of money on that commercial derivative without having to pay anything to the MySQL copyright holder. That really broke a lot of business models. It forced a lot of companies which are capital heavy, and which really need to focus on providing high returns to their backers, to change their licenses away from open source to licenses which are not quite open source but protect them from being disrupted by the cloud hyperscalers.
They are, of course, within their rights to do those things. Like other businesses, they have a responsibility to their shareholders. But we have to be mindful that those changes mean there is less open source software available.
Another interesting thing that has been happening with the cloud is what I would call the great rebundling. With cloud services, you now often have your hardware cost and software usage cost mixed together. For example, if you're buying an Amazon Aurora instance, it's not really separated into how much you pay for software and how much you pay for hardware. That is problematic for open source, because you no longer have the zero price effect. There is something almost magical about a price of free: it often leads us to spend more time and effort working with a free solution, even when the alternative would cost just a little bit of money.
Additionally, the most convenient and easy way to adopt databases has become the pattern called database as a service. It is fantastic, and I think that is where databases at large are moving in the future. An interesting added benefit is that it allows developers to choose more database technologies to meet their needs, and to experiment more. In the past, before database as a service, if I wanted to deploy a new kind of database and use it in my software, I would need to make sure that the ops team was able to provide all kinds of services for it: keeping it up 24x7, security, patching policies, and so on. With database as a service, that can often be outsourced to a vendor, and you as a developer just get a database instance you can use. I think that is one of the reasons why we see so many companies using an increasing number of purpose-built databases, open source and not, for their applications.
Another thing which I think is quite interesting about database as a service is that, like many things, it tends to be over-marketed. It does provide a lot of fantastic benefits, but the promise is much bigger than that. We see a number of customers who come to us and say: you know what, we were sold on a fully managed database as a service where you don't need any DBAs and whatnot, but then we figured out you actually need somebody who understands databases. The database as a service is not going to design a schema for you, or tell you how to rewrite bad queries into good queries, which is one of the core functions of your DBA team if you have one. We have also found that teams which do not have a lot of database knowledge and understanding often have additional problems when it comes to using database as a service. In the last few years there have been a lot more security incidents because databases were not configured appropriately, and more downtime, for example because of a lack of appropriate capacity planning. So keep that in mind if that's something you are thinking about.
Now, if you think about the cloud, I think there are two different approaches to how the cloud can be used. They are both in use at the same time, and different teams think about it differently. Some think about the cloud as a commodity and compare it to something like electricity or your internet provider. The providers are not very differentiated, you can switch from one to another relatively easily, and this gives you a lot of leverage in negotiation if you need it. In this case, you use services with many compatible implementations, maybe S3 for storage, and compute instances.
In many cases these days, that is where Kubernetes is used as a cloud-neutral API. That gives you a lot of flexibility in picking and choosing a cloud, but it may not be as polished or as efficient. And that is where the other approach comes in, when you say: hey, I am going to use the proprietary solutions available from the vendor to build my applications as quickly as possible. I will use database technologies such as Amazon Aurora or even DynamoDB, and a full stack of those highly differentiated proprietary technologies. That will possibly allow me to move faster, but it comes with a risk: if I have to adopt another cloud, it will be very expensive or impossible.
From our side, and again, we are quite biased in the open source space, what we see is that nobody really wants to be a hostage. With the database trends I described, you see a pivot happening towards what you could call the multiverse: multiple database technologies used in multi-cloud and hybrid cloud environments. For this ecosystem, a lot of proprietary solutions are actually available to run multi-cloud and hybrid cloud. All the big cloud vendors, as well as companies like VMware, have some solution in this market. Additionally, we have Kubernetes emerging as the leading open source alternative which works on any cloud, be that public cloud or private cloud.
So think about open source databases: how do they evolve in this market, and what should they do? One thing, I think, is that they should adapt to cloud-native deployment in multi-cloud and hybrid cloud environments. This is different from on-prem deployment in many dimensions. The second is that the Kubernetes API is really the API of choice for many open source database deployments, and we see increasing standardization happening in this space.
What I think is still missing right now is a focus on simplicity. If you really want an integrated database as a service solution which is similar to Amazon RDS, Aurora, Google Cloud SQL, and so on in terms of simplicity, it is hard to do. I haven't seen open source solutions which are at that level yet. So as you're choosing a database as a service, you may not be able to get everything from open source yet, but at least ask the question: how do you get the most from open source? From our side at Percona, we are working hard to push the boundaries of what open source can offer with our products: Percona Monitoring and Management, which is your GUI for monitoring and, in the future, deployment and management, and also the operators, which allow you to run databases in a Kubernetes environment efficiently.
Okay. In the last 15 minutes or so, I want to talk about database technology and the changes happening in this space. If you think about the brief history of database technologies, it's interesting how everything old is new again. In the very early days of database technology, the 60s and 70s, there were a lot of different models, implementations, and languages, and a lot of fragmentation. Then by the 80s and 90s, a lot of standardization had happened, with pretty much complete dominance of the relational database and the SQL query language. There were different databases from different vendors, of course, but the big decisions on language and so on were very much unified. Then, starting from the 2000s, we have a new wave of innovation, both in data models and in query languages. So what trends do we actually see right now, as we are starting the 2020s?
Well, one is that developers and architects are empowered to make more choices related to database technology. One of the drivers of that is database as a service. It means developers can choose a technology and have it run for them without needing another team of ops people to commit to it, at least in certain environments. So the cloud makes using multiple databases easy. What is also interesting is microservice architectures. Even if you don't go all the way to microservices, decomposing the monolith into more building blocks often means that each of those building blocks has its own needs and its own choice of data store, and that's how multiple technologies get adopted.
That also brings us to the term multi-store, where in many cases you have the same information stored in different forms in multiple systems. For example, you may use MySQL as your database of record, to record orders and so on, and then ship the data through Kafka to Elasticsearch for full-text search, because it's much better at that. You can use the same Kafka to ship the data to ClickHouse or Redshift to run very fast analytical workloads with a column store.
One question you may ask is: should we be using relational or non-relational databases? I think it's interesting that even now, while a lot of other databases have come to fruition, relational databases are still completely dominant, at least among general purpose databases. At the same time, if you look at the growth rate, you see that things like time series databases, document databases, and so on all grow a lot faster than the relational databases.
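The multi-store pattern mentioned above, one system of record whose changes are shipped through a log to specialized stores, can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not how any of the named products work internally: `ChangeLog` stands in for a Kafka topic, `SearchIndex` for a full-text engine, and `ColumnStore` for an analytical column store. All class and function names are made up for the example.

```python
from collections import defaultdict

class ChangeLog:
    """Append-only list of change events; a stand-in for a message log."""
    def __init__(self):
        self.events = []

    def append(self, event):
        self.events.append(event)

    def read_from(self, offset):
        return self.events[offset:]

class SearchIndex:
    """Inverted index built from the log; stand-in for a full-text store."""
    def __init__(self):
        self.offset = 0  # how far into the log this consumer has read
        self.postings = defaultdict(set)

    def consume(self, log):
        for event in log.read_from(self.offset):
            for word in event["item"].lower().split():
                self.postings[word].add(event["order_id"])
            self.offset += 1

    def search(self, word):
        return sorted(self.postings.get(word.lower(), set()))

class ColumnStore:
    """Values kept per column; stand-in for an analytical column store."""
    def __init__(self):
        self.offset = 0
        self.columns = {"order_id": [], "amount": []}

    def consume(self, log):
        for event in log.read_from(self.offset):
            self.columns["order_id"].append(event["order_id"])
            self.columns["amount"].append(event["amount"])
            self.offset += 1

    def total_amount(self):
        return sum(self.columns["amount"])

orders = {}        # stand-in for the database of record
log = ChangeLog()

def record_order(order_id, item, amount):
    # Simplification: a real system would usually derive events from the
    # database itself (change data capture) rather than writing twice.
    orders[order_id] = {"item": item, "amount": amount}
    log.append({"order_id": order_id, "item": item, "amount": amount})

record_order(1, "red bicycle", 250.0)
record_order(2, "bicycle helmet", 40.0)

search_index = SearchIndex()
analytics = ColumnStore()
search_index.consume(log)
analytics.consume(log)

print(search_index.search("bicycle"))
print(analytics.total_amount())
```

Each consumer keeps its own offset, so the same stream of events can feed any number of stores, each holding the data in the form that suits its workload.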
So a lot of those special purpose databases are growing much faster these days.
There are two kinds of innovation happening with the data model. One approach is to break with the relational data model entirely. That's what many did: think Cassandra, MongoDB, Redis, and so on. The other is to extend SQL, and that is what has been happening in many SQL technologies. Think about MySQL, Postgres, or even SQLite: all of them have extended their relational SQL model to process document data, specifically JSON, better.
Something else I find quite interesting is another kind of dual trend, where two competing approaches exist. One of them is multi-model databases, where you have one database which can be used through different protocols and with different data models. ArangoDB supports many different models. Even MySQL, while it doesn't market itself as a multi-model database, has both a SQL interface and a document store CRUD protocol.
Another one is hybrid transactional/analytical databases versus multi-store. One approach is to say: hey, you'll have one database, for example PingCAP's TiDB and many others, which is very good at running both analytical queries and transactional queries. The other approach says: you know what, we will use one database technology which is optimized for transactions and another which is optimized for analytical queries, and the data will flow between them. Both approaches are currently used successfully, and it will be very interesting to see whether they come together somewhere in the future or continue to exist side by side.
So, modes of scaling. There are two scaling approaches when it comes to databases.
One, the traditional one from the older days, is scaling up. A couple of decades ago, if you needed to scale your Oracle instance, you would often buy an even bigger box and run on it. The scale-out approach means that instead you spread the database across many systems, which is often referred to as sharding. MongoDB, Cassandra, YugabyteDB, PlanetScale with Vitess: they are all designed with scale-out in mind. There is no question that if you want to build applications at huge scale, think Facebook scale, you can't really scale up. There is no single server in existence which can run Facebook's workload. But the question is: do we need both? Do we need databases which are optimized for working very efficiently, maybe for medium-size databases, within the constraints of a single server? Or should we only be looking at distributed databases, which tend to be more complicated and have different characteristics?
A couple of architecture trends are driving database decisions right now. One is locally distributed: that is pretty much your sharded database, which allows you to scale out. Another interesting trend, which many use cases demand right now, is geographically distributed, where you say: I need my database to live in many geographical regions. That can be for performance reasons. It can also be for legal reasons: you may have a local government saying that information about its users needs to be stored in its country. If you want to have all your users in a single database, your database has to be geographically aware.
We have a lot of work going on with cloud-native and Kubernetes-focused databases.
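To make the scale-out and geographic ideas above a little more concrete, here is a minimal sketch of how a row could be routed to a shard by hashing its key, and to a region by the user's country. It is an illustration only, not how any of the databases named above place data. The shard names, region names, and country mapping are all hypothetical.

```python
import hashlib

# Hypothetical shard and region names, for illustration only.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]
REGION_FOR_COUNTRY = {"DE": "eu-central", "FR": "eu-central", "US": "us-east"}
DEFAULT_REGION = "us-east"

def shard_for(key: str) -> str:
    """Pick a shard by hashing the key, so the same key always lands on the same shard."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]

def placement(user_id: str, country: str):
    """Return (region, shard): region from a residency rule, shard from the key hash."""
    region = REGION_FOR_COUNTRY.get(country, DEFAULT_REGION)
    return region, shard_for(user_id)

print(placement("user-42", "DE"))
print(placement("user-43", "US"))
```

Simple modulo hashing like this moves most keys when the number of shards changes, which is one reason real systems tend to use range-based or consistent-hashing schemes instead.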
At the same time, many databases are being built as cloud-only databases, available only within the constraints of a given proprietary cloud. Think about Cosmos DB, or DynamoDB, which I mentioned already.
Another interesting trend is the separation of storage and compute, which a lot of databases are thinking about or actively pursuing. The very seductive benefit of this approach is that it allows you to scale storage and compute separately. It also allows you to make your compute stateless, which brings many other architecture and design benefits. That is not something that traditional databases like MySQL, Postgres, or even MongoDB were designed for.
Hardware acceleration is also a very interesting trend. We see many analytical databases, for example, figuring out how to use GPUs successfully. There is also a new generation of storage coming which can accelerate some database operations.
As a summary, I want to highlight that there is a lot going on in the open source database space. I think this is a great time to be involved with open source databases. There are a lot of fun things to do, as well as a lot of career opportunities which come with open source databases. One thing I would personally encourage you all to do is, to whatever extent possible, keep open source open, for the benefit of all of us.
With that, that's all I have for you folks. Feel free to reach out if you have any comments or questions.