This is my take on the topic SQL versus NoSQL. I'm Jens Wilke, and I go by the nick cruftex. You will find me on Twitter, it's up there at the top. And I also blog sometimes. I'm an open source developer. One of my projects is cache2k. It's a Java in-memory cache. And I also run a boutique software shop in Munich, which is called headissue. And I'm involved in a couple of startups.

My database background: besides a lot of other things, since I have been around for a couple of years, I actually wrote a Java database engine back in 2004 for a German set-top box. Yeah, without any SQL, so it's NoSQL. That term didn't exist at that time. I wrote two ORM persistence layers. I did some database-related work at the Technical University of Munich with Professor Bayer, who invented the B-trees. And I've been a happy PostgreSQL user since 1999.

So yeah, why give this presentation? Because I'm involved with startups, and I see a lot that they make interesting choices because they don't know better or they go for the obvious, modern stuff. And sometimes things don't go the way they're supposed to, and they struggle with simple stuff when running on NoSQL databases. So this is my condensed list of things you should keep in mind. Of course, I cannot go into every detail. Actually, this is about opening your minds to the fact that there are a lot of details to think about.

So why NoSQL at all? Of course, the first thing is the scaling issue. There's this term "web scale", which is massively scaling things. So whenever a startup is making its initial architecture decisions, they go for "oh, we need that". Then of course there's the special-purpose stuff like document stores and time series databases, which fit a special usage scenario better. And there's this other thing: SQL sucks. Yeah, there's a blog post that I found that goes into detail about why SQL sucks. And the funny thing is, at the bottom of the post you have lots of comments. I picked two of these comments.
The one is: oh, SQL is fun to learn, it's so great, and whatnot. The second one is: I hate SQL. I can write HTML and JavaScript, but SQL? It's like someone decided to give a genetically engineered gorilla the opportunity to write code.

And there's this other thing, the impedance mismatch. There's the relational model, which actually fits nothing. We write our code in structures, we have objects in a class model. We have JSON data, we have XML data, we have graphs. And the relational model doesn't actually fit any of it. It's not meant to fit, and it doesn't. And there's the other thing: everything should be simple. With NoSQL, no SQL, you can just take the data and put it into the database.

Okay, I say SQL is cool. Yeah, you have all these issues, but much of the misunderstanding comes from a different approach to the language. It's different from a programming language. A programming language is imperative: you go through it sequentially and say, compute this, then this, then this. SQL is different; SQL has a declarative style. You tell your database, this is the thing I want to know, and the database decides for you what it actually does to carry out the operation. And well, I know there's a lot of legacy in the SQL syntax, which makes it really, really hard. But there's a lot of legacy in every kind of stuff; the Unix tool find is a nice example.

So, NoSQL, really? I think the term NoSQL is actually not a good term, because in databases SQL is either there or comes back. At some point you need a language that everybody understands and that has a common standard. So there are a lot of modern database systems which are distributed or have a special purpose that actually also use SQL. And another thing is that the languages that are invented for NoSQL databases actually look a lot like SQL.
And I'm not so happy with that approach, because it's very error-prone and very hard to understand if you have a lot of languages that look similar but are not.

Okay. When you want to have fault tolerance and scaling, we are talking about distributed systems. In the year 1998, Eric Brewer came up with the CAP theorem to get a better understanding of what the actual problems and limitations are when you build a distributed database. What he said is that there are three things that you may want to achieve, consistency, availability, and partition tolerance, but you never achieve all three of them. You've got to pick two. And here is a short try to put the existing stuff into categories. The traditional database stuff is more in CA: you have consistency and availability, but no partition tolerance. It's just a try to put the databases into these categories. I actually took this from a blog post from 2010. There are a lot of discussions about where to actually put the stuff, and it's quite funny to see that MongoDB is in the AP area. More about this later.

So there are a lot of details with distributed databases, NoSQL databases. The first thing is that there are a lot of smart algorithms and theories, and then there are ideas that came up around 2008 with better protocols. There are a lot of ideas about what consistency actually means to you, and how it shows up in the semantics of your database can be very, very different.

Oh, consistency. There's a guy called Kingsbury, who is doing the Jepsen reports. This is actually a test suite that he runs continuously on different kinds of NoSQL databases, and he finds interesting things about their assumptions about consistency.
And in February 2017 he ran his stuff on the MongoDB release that was coming up. I just randomly picked something here, and he says that MongoDB's v0 replication protocol is intrinsically unsafe, allowing the loss of majority-committed documents. Actually this is a good thing, because somebody is looking into the stuff and pinpoints problems, and MongoDB invested a lot in fixing those problems and formulated some guarantees that you can have on consistency when you use MongoDB. So I think this is kind of cool and groundbreaking stuff that MongoDB is doing, and other NoSQL vendors are trying the same. You need to find out by yourself what consistency levels you get and how the database behaves. It's not so structured.

Yeah, a wrap-up on this: you'll find a lot of scary stuff about your favorite NoSQL database in the Jepsen reports. If you don't find anything, then it's even more scary.

Rumble in the NoSQL paradise. Oops, we've got to earn money. So there's this example, Riak. It's a distributed key-value database. The major driver was a company called Basho Technologies. The initial release was in August 2008, and in mid-2017 Basho ran out of money. Last year they made their first release since then, so the community was able to take over the project. Also, for my own database stuff, I always looked at them and said, oh, they're doing cool stuff and they are doing well. And yeah, it went quite well until they ran out of money.

Okay, MongoDB. It's actually quite a similar timeframe. They did the initial release in 2009. In 2017 they did not run out of money. Instead, they went IPO and got more money. In 2018 they had a total funding of 209 million, a revenue of 154 million, and actually a negative cash flow of 47 million. I'm not good at reading financial reports, but I think they are not on the positive side yet, and we need to see how the story goes.
In October 2018 they decided to change the license, because they say, okay, people like Amazon are actually using their product without giving back, so now we have a license that forces Amazon to give back. But what happened is that this is actually not an OSI-approved license, so they were dropped from the major Linux distributions. So there's a lot of discussion about this licensing and whether it's actually an open source license or not, whether it can get OSI-approved or not.

Here's my take on this discussion. It actually doesn't matter whether it's OSI-compliant or not. The more restrictive your license is, the more you are killing the community that may or may not develop around your core product. You have less adoption, you have fewer thinkers, you have less innovation in the core product, and the chance that the product will disappear is quite high, because there is no interest by the community to invest in the product, or maybe to fork or take over the product, because you can't change the license.

Another thing is that there are CouchDB, Cassandra, and Elasticsearch. These are in the ASF, the Apache Software Foundation, area or are Apache licensed. CouchDB and Cassandra are actually Apache Software Foundation projects, and Elasticsearch heavily uses Apache stuff as its database engine underneath. For example, Apache Lucene is the index engine Elasticsearch is using. I picked out those three to show you that this is actually the other side. Those three products are heavily backed by one company, but on the other hand they are rooted in the Apache community.

What about the old guys? Yeah, PostgreSQL, 1996 initial release. There's a diverse community. They have support for JSON, support for XML, solutions for high availability and scaling, and all that stuff. But there is a lot of fragmentation with these options: there are different add-ons, and there are different companies driving in different directions.
Unfortunately, there's no relevant Jepsen test if you want to know about your high-scalability solution with Postgres; that would be interesting. But I think you cannot have a diverse community without any fragmentation. It's not possible. MySQL, yeah, that's the picture of their booth down there. Of course, they are going in the same direction. And commercial vendors like Oracle and IBM DB2 or whatever, is that still interesting? Yes, okay, they do the same. They add support for JSON and other things, so that you can just store your data in your data format and let the database understand it.

Yeah, there are different directions of databases. The first one is the universal, relational approach, the traditional approach with SQL, for example Postgres and MySQL. And there's the more specialized stuff: wide-column stores, document stores, key-value stores, time series databases, graph or triple stores, search engines, and streaming databases. And there's no single reason to go for one database. Like before, somebody says you need to be highly scalable and you need to be super fast, but there are a lot of other reasons that you should consider when choosing a database. How many committers are on the whole project? What does the community look like? How many years has it been around, how mature is this stuff? Like what I showed earlier, the Jepsen report about MongoDB from 2017 discovered major issues in their consistency. That's just two years ago.

So this is my takeaway. I think every developer should know about SQL and have a very universal database in his or her toolbox. It's not good at anything special in particular, but quite good at almost everything.
And whether it's MySQL, Postgres, or the commercial guys, you can tune them in a lot of ways in the direction you need. If you have any special need for scale, you can do that. If you want to have special indexing for geographic stuff, you can do that with Postgres and this extension. You can do a lot of things. And NoSQL is great, because now there are even more databases that are specialized in a certain area. Go for that as a second step and say, okay, I now have an application that has this special need. Then there are a lot of nice databases to choose from, but then you should know about their semantics in detail, and their limitations. So stay tuned. That's it. Thanks, enjoy life.
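
To illustrate the point from the talk that SQL is declarative while a programming language is imperative, here is a minimal sketch. It assumes Python's built-in sqlite3 module and a hypothetical orders table with made-up rows; none of these names or numbers come from the talk. Both functions compute the same thing, the total per customer for customers whose total is above 10. The first one spells out every step, the second one only describes the result and leaves the execution plan to the database.

```python
import sqlite3

# Hypothetical example data: (customer, amount). Not from the talk.
ORDERS = [("alice", 30), ("bob", 20), ("alice", 15), ("carol", 5), ("bob", 40)]


def totals_imperative(rows):
    """Imperative style: spell out each step of the computation."""
    totals = {}
    for customer, amount in rows:
        totals[customer] = totals.get(customer, 0) + amount
    # Keep customers whose total is above 10, sorted by name.
    return sorted((c, t) for c, t in totals.items() if t > 10)


def totals_declarative(rows):
    """Declarative style: describe the result, let the engine plan the work."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        query = """
            SELECT customer, SUM(amount) AS total
            FROM orders
            GROUP BY customer
            HAVING SUM(amount) > 10
            ORDER BY customer
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(totals_imperative(ORDERS))
    print(totals_declarative(ORDERS))
```

In the declarative version the query never says how to group or in which order to scan the rows. That is the part the database decides, and it is also why the same statement can stay unchanged when an index is added or the data grows.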