So Alex had something come up and can't join. Thank you, everyone, for adding your notes to the agenda. I think what we wanted to cover today was the template that Luis was working on, if he happens to join, and also the database paper that Sugu was working on. If there are other agenda items that people want to add, please go ahead and add them here. Let me repost the link in the chat.

For the database paper? Yeah, I had it further down here. I was going to grab it as well. I'll add it to the agenda.

Okay, and did I pronounce your name right? Is it Sugu?

Yeah, Sugu.

And is Luis on?

I wasn't sure he'd be able to do the second one. I think he might be on a Portworx webinar or something today.

So I think we can go ahead and get started. It's five after. Not sure I can record. Does Alex generally record, Steve?

It's already recording. It already started.

All right. So can everyone get into that white paper document?

Actually, I don't think that document has it. I wrote a separate doc, to be added to the document later. Let me get the link for that.

Did you put it in the chat? I don't see it.

It looks like that doc requires permissions that I don't have. I'm trying to get the link.

Okay. If you have edit permissions and want to put it in the agenda, that'd be great. If not, just put it in the chat and I'll add it there.

Where was it? Here.

I can only stay for half an hour because I have to catch a plane, but Quinton should be here by then and he's going to take over for the second part of the meeting.

My computer is refusing to move forward. Oh, no.

Well, we can go over these process reviews for the TOC, if you want, while you're doing that.

Yeah, it'll take me only a minute.

Okay. Do you want me to wait for you, or...?
I'll open it soon. There you go, I found it.

Great.

Oh, I don't have edit permissions.

Okay, just go ahead and put it in the chat and I'll add it to the document then.

Yeah, the white paper link opens for me. Another doc, the one he wrote separately. Yeah. Here it is.

Great, I've added that to the doc. Can everyone get in there?

It says it's the same thing, "database edits". So it's a different copy of it.

I'm in there.

Yeah, I'm in.

Okay, can everyone else get in? I think a common problem with these SIGs and working groups is that the permissions are generally tied to the mailing list, but if you have multiple Google accounts and end up logged into the wrong one, that's a common reason for not being able to get in.

I think the permission is restricted. I cannot write to that doc, the agenda doc.

Yeah, I think that's locked down. I had to hand out direct permissions. I believe you should have full permissions on this one. I made it.

Yeah, okay. It looks like I can.

Okay. Pardon me, I wasn't able to make the last call. Did we get partway through this doc, or do we want to start at the top?

This is the first time.

Okay.

Yeah. The one thing I did was, because key-value stores and databases have similarities, I'm not covering those parts. I just said to look at key-value stores for the trade-offs. So it's kind of additive. The stuff that I added here is things to consider beyond what you would otherwise think about when moving a key-value store into cloud storage.

Okay. Sorry, I'm just reading through that. Do we want to say that every database vendor is required to provide drivers that conform to these APIs? I mean, you could, if you want to be cloud native and have the interoperability, but...

This is just the old ODBC, JDBC, that kind of stuff.
Actually, the newer database vendors, I don't know if they provide these things. I think Spanner, for example, doesn't. I don't think it provides any standard.

Yeah, that's why I think that's maybe an inaccurate statement.

Yeah, you can say "is encouraged to provide drivers".

There you go. Yeah, that's a much nicer way to put it.

Okay, and do we consider NoSQL databases to be databases? Because this one has the heading of "Databases".

Yeah, the key-value stores are the NoSQL ones. That's section nine in that document. And actually, I later noticed that TiDB might have been mentioned in section nine, so we should remove that, because TiKV would be... I think it wasn't TiDB, it might have been Cockroach. Let me check.

Are you looking back at the old document?

Yeah, in the old document. It's a big document. So in section nine, where did I see it? Yeah: "Spanner, Cockroach, TiKV, FaunaDB provide distributed key-value store APIs." I believe Spanner and CockroachDB are more databases than key-value stores.

So you think they still belong here? I'm highlighting it in the doc, if people can see that.

Yeah. I wasn't sure whether we were restricted to just open source, or anything that's considered generally popular nowadays. So I wasn't sure whether CockroachDB should be mentioned, but then I saw it mentioned in section nine. That might also have been written before CockroachDB decided to change their license. It's not considered open source anymore by many standards.
Okay. I guess I'd like to hear what people's opinions are on that. I would think we would probably limit it to open source, just because that would be the view of the CNCF community. But I don't know if we're just trying to cover things in general.

I think if you do that, you should change the title of section 10 to "Open Source Databases", just to communicate that.

That's a good idea. Reduce the scope of coverage.

Because there are people who choose to put even commercial software in containers, you know. I don't think the CNCF is going to accept a non-open-source project or promote it aggressively, but the user community is still sometimes in search of help in terms of how to do that, even if it's only to support a legacy application.

Yeah, we had a similar discussion in the context of the white paper last year. The topic came up in the context of object stores, and people were wondering whether we could mention things like S3, etc. The decision we came to at that point was that it does make sense to mention the non-open-source stuff, particularly where they are household names in a particular category. Oracle Database would be one perfect example, of course. We don't pretend they don't exist or anything, but we don't necessarily go into great detail on them, and we focus more of the attention on the open source ones.

Makes sense.

It does. And not only that, there have been a few instances of people converting from open source licenses to other licenses, and it would be a mission to try to keep up to date on all of these and then go back and retroactively edit documents.

Well, yeah. So we should amend it to say "Open Source Databases". And Cockroach is a perfect example of that, Steve.
So maybe we delete it from here.

Yeah. That, or we can just leave it as "Databases", in which case we should add Spanner. Some people also talk about things like Aurora as NewSQL, but I don't know if I agree with that point of view, because it's definitely not cloud native.

Sorry, I joined the conversation late, so I'm lacking a little bit of context. But I think where we landed last time with a similar conversation was essentially that if there are names of commercial things, or things that are not open source, or things that are debatably open source, and we omit them from the paper, that causes confusion, and so we should mention them. And if there is debate about whether something is open source, we should just say that it is not universally agreed whether this should be considered open source, or something along those lines, if that is the case. CockroachDB is definitely verging on a household name. I don't know whether the debate was about whether or not it is open source, but irrespective of whether it is, it should be mentioned on the basis that it is a household name in that category.

Or we could change it... I guess, yeah, that's true. I was thinking maybe we could call out that CockroachDB is source-available. But then we go into explaining all these differences, which is not the focus of this paper.

Yeah, I wouldn't get too hung up on the definition of open source. I think that is, as you say, a topic for a different discussion. If there is a reasonably good reference to the debate, we can put a link in there and say that, depending on what definition of open source you use, some of these may or may not be considered open source. Something along those lines.

I think it's better not to be that strict. What I would prefer to do instead is just add Spanner to this list as well. So you can say Vitess, Spanner, TiDB, and Cockroach provide relational features, and just leave it at that.

Should we add a one-paragraph section that categorizes them into open source ones and non-open-source ones, so that in just one place we show whether they are open source or not?

I think people will figure it out when they go look for it, right? They'll realize that, okay, TiDB is open source, not CockroachDB. Vitess is open source. Spanner: no source available.

I would suggest against that. I would make it very upfront here. I don't think we should expect people to go digging through websites to figure out whether or not things are open source. I think that's important enough that we should put it right here in the paper.

Or just an appendix at the end or something that keeps track of this? I don't like the idea of it scattered throughout the document as each thing comes up, because it's tough to maintain.

What is the thinking against making a distinction in the document? Is the problem that...

Oh, I don't mind putting it somewhere in the document. But if we do it on every instance of the name of an open source entity, that's a pain to keep up. If we do it on the first one, it's still tough to find them all. So perhaps an appendix at the end.

Yeah. Or, just a crazy thought I had: maybe just change the typeface, you know, non-open-source stuff in italics or something, so that it's very clear when you're reading the document whether you're referring to a non-open-source or an open source one. Just a thought.

Yeah, or just some sort of footnote symbol, where the footnote says "not a standard open source license" or something like that.
Yeah. Anyway, we've probably beaten this one to death.

Okay. So which way are we going? Are we leaning towards keeping these names for now and then qualifying them if needed?

Yes, I think we are.

So we should change the title back to just "Databases" then.

Yeah. Done. And I will add Cockroach here for now. You know, one easier thing may be to provide links to TiDB and Vitess, and no links to Cockroach and... sorry, provide no links for Spanner or Cockroach.

Sorry, I came in a little late. What document are we talking about?

Oh, it's in the chat, Luis. It's in the agenda as well. Here.

Yeah, TiDB is Apache License 2. I think they are pretty clean. So we can maybe provide links to Vitess and TiDB, and no links to Spanner and Cockroach.

What's the thinking behind some with links and some without?

The fact that CockroachDB is not considered open source, and Spanner is not open source.

I would either put links to everything or links to nothing. And I would put a footnote if something is totally not open source, or if there's some controversy around the open source license, you could just put a footnote with a link to something.

Yeah, that was Steve's suggestion, to have an appendix.

One thing I'm trying to understand, as the Cloud Native Foundation, part of the Linux Foundation, and again, I'm looking at this from the big picture... anyway. I'm thinking that we should be very open source friendly, and in that model, in my opinion, the paper probably shouldn't have anything at all that's closed source, because then it becomes a marketing thing. What I was afraid of is somebody else from commercial industry asking, "Why didn't you mention us?"
Yeah. So I think we should focus very, very clearly on things that are open source, and anything that's closed source is just not in the paper at all. Again, I feel that way because we are part of the Linux Foundation. If we were some other type of foundation that dealt with both closed and open source projects, then I'd be fine. But as part of the Linux Foundation, I feel we should be very clear that we are looking at only open source projects.

Yeah, I agree. But how often do we plan on updating this doc as well? Because at one time Cockroach was open as well. So should we even mention names in here? Because that then puts a requirement on us to make sure that these are updated.

I think one can get around that by saying "at time of writing" and the version. With Cockroach, if it was open source at one time, you can go back to the open source version, I'm sure, right?

Yeah. But we had this whole debate on the white paper, Luis, I don't know if you remember it. The problem with your approach... so firstly, to be clear, we do not intend to be exhaustive about listing all non-open-source or open source databases. We decided to list a few closed source ones in particular where they provide useful information to contextualize people's thinking. So when you talk about object stores, people understand what S3 is. They understand it's the definitive object store. It was the first one and it sort of defines the category, and so we mentioned it explicitly. And I think there are other categories that have similar things.
We did not exhaustively mention every single possible closed source object store, but we did mention S3 and maybe another one, I don't remember. And I think we should be consistent. Personally, I think that principle is sound and we should continue with it.

Well, with S3 specifically, it's because I feel that S3 itself is becoming a protocol more than an implementation. So when we say S3 today, we mean the protocol, and we mean the APIs. So we put Minio, for example, and we say S3, right? So I feel that that was okay. But I feel that if we start moving the needle towards that location, we may find, and again this is my opinion, that it becomes difficult. I'm trying to pull the needle back to the open-source-only model.

I agree with Quinton. I think we don't want to promote the non-open-source ones, but mentioning them, if it's just a commercial reality that they're popular and people will want to run them in containers, is okay, so long as you don't cross the line into promotion and you indicate that these are in a different category.

Yeah, just raising my opinion.

Yeah. I mean, Spanner is another good example. With Spanner there was a lot of publicity, there were papers published, and it was the thing that started a lot of this discussion around distributed databases, etc. It happens not to be open source, as far as I'm aware, and it's now, I believe, available on Google Cloud as a commercial offering. But I think it should definitely be here. Just about everybody has heard of Spanner, and it helps contextualize what the other things are and what the categories are.

Yeah, just my opinion.
Yeah, I guess we'll get to visit each one as they come.

I didn't quite understand that. Visit what?

Visit, like, we'll talk about each one as they come up in the paper, like we're doing right now with the database one. I'm just saying we don't need to make a blanket statement. As they come up in the paper, we'll look at each one and make a judgment on each.

Well, I think we do need to have a principle on which the judgment is made. The principle I'm proposing is that where it is the name of a commercial product which is reasonably well known and helps to contextualize people's thinking, then we should mention it. We should not try to be comprehensive. That's the basic principle.

Yeah, I find both sides of the argument valid. I don't know which way to lean.

The other side, to be clear, is to not mention anything that's not open source.

Yeah. I don't know. They both have their merits.

Yeah. And again, what I'm trying to say is that if we were not part of the Linux Foundation, I'd be all for it. But being part of it, I feel that we should be very strongly focused on open source projects.

Let's table this. We can have the discussion separately rather than spend the next half hour on it. I can understand both sides of the argument. Maybe we should just table it, and either have it in the document in comments or have a follow-up discussion specifically on this topic.

Okay. So we've changed it back to "Databases". We're going to leave this in here for now. Luis, if you want to add your comments in the side, that's fine.

We can also add that if we decide to remove Cockroach, we should also remove it from section nine.

I'm just going to add a comment here, unless Luis was going to add a comment.

No. Whatever. Let me know what to do.
Okay, I'll just add it in.

Erin, I think you said you wanted to run...

Yeah, I've got to catch my plane. Was there anything in particular left on the agenda after this?

Luis's template he was working on, and then you could talk more about what we talked about yesterday with the sandbox draft.

But I'm sure you haven't been able to go through it, so maybe we could table that.

Okay, we'll just put the onus on people to look through it. Is it available? Would you like to make it available more broadly yet?

No, I think I'd like you guys to review it first.

All right. We'll try and get it up later this week, let's say.

Okay. All right. Sorry, guys, I've got to jet, literally.

I can't edit the agenda.

I can fix that. Sorry. There were some strange events with people putting strange things in CNCF documents, so we are a little bit more pedantic about it.

No problem.

I can give that to you. What is your email address?

I'm requesting access right now.

It's buried somewhere in my email. Okay, I see it.

It's Luis at Portworx dot com.

Okay, yeah, you should have access.

Okay. Sorry, we derailed your review a little bit there. Sugu, do you want to continue? Is someone talking? I'm not hearing anything.

Can you hear me? Hello?

Yes. I was saying, would you like to continue? Sorry, we hijacked your document review.

Actually, people were just reading and commenting. I wasn't walking anyone through it. So I'll wait for comments.

So, in other sections of the white paper, we kind of broke things into categories and then discussed them along various axes. This is just from memory; I've only just read your document now, sorry. One, we had distributed versus centralized, etc. And secondly, we had the relative properties (availability, scalability, performance, etc.) sketched out in a sort of summary table.
Is that something you've considered for this?

Essentially, if you look at the paragraph after the mention of the databases: because key-value stores and databases use storage in similar ways, all the trade-offs that apply to key-value stores apply to databases also. So I've just said to look at section 9.4 for that.

Okay, that makes a lot of sense. Are there any additional considerations around transactions and those kinds of things? From memory, most of these databases support some form of multi-operation transactions, where key-value stores typically don't.

That's a good point. We could get into that, because different systems offer different trade-offs based on the type of transactions they support. But even I am not too sure about it. We can say that these systems provide a spectrum of trade-offs between consistency and availability, but there are very strange nuances. For example, Spanner has a very different transaction model. I believe Cockroach is very, very strict ACID.

Yes.

And then TiDB may offer trade-offs. Vitess also offers trade-offs. So it's all over the map. We should definitely not get into the details, because that itself is huge. Each of those things has been covered in multiple blogs and papers.

Yeah, I feel like it should be more like a... I didn't mean to talk over you; finish first.

Yeah, that's all I was going to mention.

Yeah, I completely agree. If we start getting too deep into the details, we're going to start creating white papers for each one of these companies. I think the goal really is to give the user enough information to then read those white papers and understand them, right?
And not really to describe each one. My take is to kind of give them... if this were an engineering school, this would be the 101 of databases, right? They will read it, and it will be a document that lives a long lifetime, because it describes what they are used for and what some of the models are, and from there they can then understand the specific product. It's just my suggestion.

Yeah. So what I can do is stay away from specific features offered by the databases, but I can cover which parts of ACID these systems make trade-offs on. Most of them trade off between atomicity and isolation, and I can mention that. I can write a short paragraph about that.

That's cool. I think that would be useful.

Are you going to cover other types of databases, like Cassandra? That's also open source, right?

Many people still think of Cassandra as a key-value store, so I think it fits better in section nine.

Okay. Is it mentioned there? I don't remember.

I don't think it's mentioned there. Oh, it is mentioned there, I just saw it. I'm not sure which page this is, but it's there: Cassandra, HBase.

Okay. I think it's ambiguous as to whether it's a database or a key-value store. I certainly think a lot of people have in their heads that it's a NoSQL database. That's a common understanding. So at the very least we can put a reference in both sections. If people don't see it in here, they'll say, "Why the hell are they not mentioning Cassandra under databases?"

Hello? Hello?

Yes, we can hear you.

Hello?

We can hear you. Maybe you can't hear us. Sugu, can you hear me?

He just disconnected. Okay, let's wait for him to reconnect. Luis, just to get back to your point about leaving out the details: I agree...

Sorry.
I got disconnected. Were you saying anything in the last minute or so?

No, we actually noticed you getting disconnected, so we waited for you. I'm just going to respond to Sugu's point about not getting into the details of the specific projects. I agree 100%: we can't go into detail about every one of the projects. What we definitely want to do... I think the crux of the whole issue is this trade-off between, essentially, strict ACID and isolation and various other things. I think we have to deal with that in a reasonable amount of detail, in its generality: that there are these fundamental trade-offs that all of these databases make, and what the spectrum of trade-offs is. We could even, just taking the CNCF databases as an example, TiDB and Vitess, point out where those fall on the spectrum. They're both configurable, with configurable trade-offs, but in general they fall somewhere on that spectrum. I think that would be very useful.

Yeah. That's a very good idea. I can add that.

And then as for Cassandra, my personal feeling is that you're right: it is ambiguously a key-value store or a database. There are definitely a lot of people who think of it, or will have read about it, as a NoSQL database.
So I think we should mention it here, even if it's just to say, "We dealt with it in the key-value store section, see there. We didn't forget about it here."

I think many of these key-value stores are beginning to add transactions as one of their core features, and that is the bigger reason why they are calling themselves databases. In that case, it would actually push Cassandra more towards being a key-value store that supports transactions.

Okay. Well, in that case, do we...

The way I would put it is, I would extend section 9 and say that key-value stores are now beginning to add transaction support to their systems and are beginning to get closer to the features of a database.

That's a good idea. And maybe just make it very clear where we drew the line in our paper. We're dealing with some things and calling them databases, and some things and calling them key-value stores, and as you say, there's a bit of a blurry line between the two. So maybe we just need to make a statement as to where we officially drew that line for the purposes of this paper.

So I think the line would be, in my eyes, being able to speak pure SQL with the database. I think that's where databases come from: you connect and then just send SQL commands, and the database does everything for you.

Well, I think there might be people who disagree with you on that. First of all, relational databases are only one kind of database. Historically, there were many before them: network databases and all sorts of others.

Object stores, yeah.

So that's the one point. The other point is that even within relational databases, SQL is not a given either. So I think that would be a bit of a contentious statement to make. It may be a little Vitess-centric.
I'm not sure.

Yeah. I'm pretty sure, actually, Spanner may be the only one that doesn't understand pure SQL. I think there are some things you need to call into APIs for. I'm pretty sure Cockroach, TiDB, and Vitess do full SQL. But yeah, I think the lines are blurring now, because it's becoming a spectrum from key-value store to database. So we can just add Cassandra, saying that, depending on how you look at it, Cassandra can be considered a key-value store or a database.

Yeah. If you go by pure relational database theory, most purists will reject pretty much all of these systems as pure databases.

As relational databases, absolutely. And I think relational databases are one kind of database. There's all the contention around distributed and NoSQL and all of that stuff, but that distinction existed a long time ago. In the 90s we had object-relational mappings, ORMs, and all these kinds of things, and they in fact don't even explicitly support SQL, right? They provide an object interface. Not in the sense of block stores or...

Yeah. They actually tried to design a language called OQL.

Yeah, exactly. I'm not saying it's a good idea. I'm just saying they do exist, and they are databases, and they're not relational. And the same goes for hierarchical databases and network databases. There are many of them outside of the relational database family, and they definitely are databases in people's minds.

Cool. So in my notes, I have that I will add Cassandra, maybe with an asterisk, saying that there are some key-value stores that are beginning to look more like databases, and Cassandra is one such example. I think I'll add that.

Sounds good.
And then I will add a paragraph that covers the trade-offs that databases make about ACID versus availability.

Sounds good. Sorry, Shing, I think I might have interrupted you. Did you want to add something?

Oh, no, I'm all set. I think Sugu said he's going to add something about Cassandra.

Yeah, and a section on the trade-offs that people make with ACID.

Okay. Yeah, thanks.

And Luis, I think I might have interrupted you too.

Luis chatted saying that he has to drop off.

Okay. This is good feedback.

Thank you. And sorry it didn't come earlier. I actually only just came across this paper. I was on vacation when you wrote most of the stuff, and I kind of lost a few weeks of my life while I was on vacation.

I mean, the ACID part I completely forgot to think about. That's definitely an important part to add. Thinking beyond that, the other thing that's common is being able to provide table joins and stuff, but I think that part can be left out.

Yeah, even that's debatable. At the risk of dragging this out too much: if you just zoom out of the detail, I think one of our responsibilities here is to educate people as to the spectrum of, and differences between, all these things that people call databases.
Yeah, and focus on the important ones. So obviously consistency, all of that ACID stuff, is very important. Some of the things people call databases, particularly in the cloud native world, do have them. Spanner and Cockroach come to mind, and TiDB for that matter. And some of them definitely do not have them. I think this is part of what causes all the confusion in this general storage space, but in particular in the so-called cloud databases space: how big those gaps between things that people call databases really are. And obviously the projects and products with gaps, like lack of ACID, for example, or lack of consistency in general, don't exactly advertise what they're missing. So people really have to go and dig around and figure this stuff out for themselves. What we're trying to do is help people not have to do all that homework.

Yeah.

And in one place say: these are the properties that can be present in a cloud database. If you choose these properties, you typically can't get these other properties, because they're inconsistent. And these are the trade-offs that have to be made. They're not flaws in a particular engineering team's thinking. They're just fundamental trade-offs.

Yeah. Pick which ones you're trading off. I will also add the other consistency trade-off, which is read-after-write consistency.

Yeah. I mean, every one of these fields is actually a whole nightmare in itself. There are some sections on these things in the rest of the white paper, and you can refer to them where possible. For example, I wrote a specific section on consistency, I think, and we can expand that if you have some more thoughts on it.
Oh, yeah, I can. That may actually be a good idea, because some of those things apply to other systems too. I think the way I would do it is to cover read-after-write consistency as a generic section, because that applies to multiple other data stores, but ACID specifically under databases, because you don't talk about ACID if you are not talking about a database.

Yes. You tend to talk about parts of it sometimes. Object stores sometimes have a concept of atomicity, for example. But I agree, as a group of four properties, they usually apply to databases.

Yeah. So I will take a look at the consistency section and see if that can be enhanced to cover read-after-write.

Yeah, it's not very comprehensive. I wrote it in a rush, and the main aim was just to tell people that consistency means a whole lot of things, so don't believe that you understand it until you've read all of these various papers about consistency. But if you want to add some detailed stuff there, or some more stuff about read-after-write, I think that would be valuable.

Sounds good. I should start taking notes now.

Sorry, we could have done that too. I think the last items on the agenda were...

Okay, so I think we have two more items on the agenda that I'm aware of. Sorry I dropped in late; I hope this is up to date. Luis has been doing some work on various papers, which we'll ask him to give us an update on next time. The other thing we have been doing, and Erin in particular: there's been a little bit of concern, I think, about inconsistency in dealing with projects that apply to the CNCF, particularly at the sandbox level.
There have been some complaints leveled, and so I think it would be useful for us to put together a detailed workflow: this is how you apply to the sandbox, and this is exactly what the responsibilities of the TOC are, specifically with respect to timelines. You will get an accept or a reject within a certain bounded time frame, etc., rather than some of these projects which kind of drag on and drag on and can't find sponsors, and this and that. The document's not ready for broad review, but hopefully it will be within the next week, and we'll send it out to the mailing list. There is a first draft of it that Erin's just tidying up for general consumption.

Anything else that anyone wanted to cover before we wrap up? There's only four of us left.

No.

All right, if that's it, thanks very much, everyone. We'll see you again in, I think it's two weeks.

Thanks.