Welcome back, everyone, to theCUBE's live coverage here in New York City for MongoDB.local. I'm John Furrier, your host. It's a packed house here in New York City as Mongo kicks off a 26-city tour, out in the streets, going to where the developers are. Not the big, one-time conference thing. It's a big conference in New York, but they're going to take it on the road. We've got a great panel here. We're going to talk about MongoDB data modeling and schema design: schemaless, when do you schema, when do you not. Three great guests, the authors, are here. Daniel Coupal, director, developer advocate at MongoDB, welcome to theCUBE. Steve Hoberman, and Pascal Desmarets, founder and CEO of Hackolade. They're the authors of MongoDB Data Modeling and Schema Design. We're going to discuss the book and how it fits in today, and also talk about the industry. Gentlemen, thank you for coming on theCUBE. First question: who's the book targeted for? Data modeling, schema design; I'm thinking database, with MongoDB obviously in the title. Who's this designed for?

So it's primarily written for two audiences. One audience is data professionals who might have a lot of experience with traditional data modeling and data governance but need to know how to do this for MongoDB. The second audience is developers: MongoDB document developers who are very skilled but need to know proper data design techniques. I don't know if you guys want to add to that?

Daniel, you're the MongoDB expert. How did the book come about?

Well, Steve will probably have a much longer story to tell here, but I'll tell my story. So Steve, who's been teaching relational databases for years, wanted to have a series on NoSQL databases. So he started the series, he has the goal of getting six of them, and he created a core that is used in all the books. And for the part that is specific to the database, for this one, he needed some experts.
So he knew Pascal and got in touch with Pascal. I received a newsletter from Steve, which I had registered for; I have known Pascal for many years, and I saw in the newsletter that these two were collaborating. So I sent an email to Pascal, who's been a friend: congratulations, I see you guys are writing a book together, it's great. By the way, I'm also writing my own book. I had started my own data modeling book at that time, and Pascal's reaction was, well, you should join us. Then we had a discussion and realized that what I've been an expert in was a good contribution to the book. We divided the work and went on, and to me the collaboration has been absolutely fantastic. So I do consider it a good day, the day I read that newsletter and got in touch with Pascal.

You guys are all writing books, you're all scratching the itch of data modeling and drinking alone, so you say, hey, let's have a glass of wine together. I'm sure a bottle of wine was involved, Pascal. That's what everyone wants to know.

Beer. Belgian beer.

Belgian beer, okay. So, data modeling is hot right now. You hear about models, foundational models. Data modeling is an official practice that's been around for a while. What's happening right now from old-school to new-school data modeling? Because to me, MongoDB is the modern database platform now. It's not just a database, it's many things, and it's becoming a data platform. With data modeling and schema, sometimes there's schema, sometimes there's no schema. What's this about?

So one of the greatest differentiators of MongoDB versus the other technologies is the document model. It's this ability to store data in JSON, which is very developer friendly and very flexible and powerful. The danger is that it looks so simple that it's deceptive: people get the wrong idea that they can build very complex things without really thinking it through.
And that's where we come in by saying, hey, MongoDB is great, but if you really want to leverage all of its capabilities for complex use cases and complex organizations, you really need to think about your data model so you can best leverage the features of MongoDB.

So give me an example, because what I hear you saying is that, in a way, Mongo is so successful because it's easy to use. And then next thing you know, you're doing complex things in a very easy way and, I won't say the words technical debt, but I'll just say it, you might get some technical debt in there. You say, hey, I've got a successful application, I've got to do more. So I won't say refactoring; maybe replatforming might be the better word. But take us through a use case example of what happens when someone gets in and needs to think about the design and the modeling.

I could start, if that's okay, and you guys jump in. I know we all have a lot of these stories, but the first step is about align. What align means is coming up with a common vocabulary. For example, I was recently working with an organization in the airline industry based out of Montreal. They did not have a standard definition of what an airline was, what a flight was, and what a leg of a flight was. So at that first stage, how can you build a system for the airline industry without knowing what an airline or a flight is? So that would be the first step, align: coming up with a common vocabulary. Then refine is all about the requirements. Now that we all speak the same language, what do you need? And notice these are independent of the document model and of MongoDB specifically. And then design is about making it work. If you guys want to add to that, design's all about the secret sauce, so coming up with a lot of the things that Daniel was talking about this morning, for example. Anyone want to chime in?
When we started many years ago at MongoDB, a lot of people didn't think that you had to design. Schemaless, we heard, and a few other things. The projects were small, and then we grew, and bigger companies started getting interested in MongoDB; it was not just startups. If you're working by yourself, you don't need to model as much. You should still take care of things, but it's not the same thing. But these days, a lot of our top customers are financial institutions, and they're used to having a modeling process. They're used to having a means to exchange information among themselves, and this is what the data model is. And the world has changed too. In the past, systems didn't have to be up 24 hours a day. They didn't have billions of documents and rows. Now we have that. So you need to understand how you're going to be using your data. This is what data modeling for MongoDB is about: understanding what you're going to be doing, then going through all the phases to get to know things better, and applying schema design patterns at the end for optimization. Things that we didn't feel were necessary in the world of the relational database in the 70s.

When I hear schema, it's a trigger word for me. I get flashbacks, okay? Oh my God, schema design. I've got to lay it all out, all this work I've got to do. It doesn't feel productive to me. It feels like a lot of work. To some, maybe it's, oh, I love schema, I want to get in there and model. So developer experience is about ease of use on one hand, but efficiency and scale on the other. Where does that meet in the middle? Because I can imagine AI coming in and disrupting this or creating value. You know, build me my schema on the fly so I don't have to. Hopefully you'll never get through it. I'm throwing that out there as a way to stoke the fire a little bit, because you have two sides of that coin. I don't want to see schema; someone else should do that.
Or, I love schema, I want to do it.

So I think that the data model is a communication tool. The developer is very good at his job and knows a lot of things. But in complex organizations, you have subject matter experts who know the business. And the data model is a way for them to communicate with each other. Just like a blueprint is used by an architect to make sure that the contractors know what to do and that the owner understands what's going to be built. So it's a communication tool, and it produces a schema, which is the contract. And the contract, you know, in IT, bits and bytes, it's got to work in the end.

A contract's important. And even though there are schemaless databases, you still have a schema the minute you store data in the database, even if it's never documented.

Good explanation. Okay, so how does it go from here? In the book, the premise is what? You should be doing more data modeling? How does data modeling get done in the future? Is it going to be input based? Or is there automation? What are your thoughts on that?

You'll need to go through these steps of align, refine, design. But the level of detail probably depends on many factors, including how complex the solution is. Do you want to do a relational model to understand the rules, as well as the document model? Those are all factors that would come into it, I think.

What are the biggest learnings in data modeling? You know, there are a lot of duplication costs on one end. You've got speed and agility. At the same time, you want to be fast.

The fundamental difference with NoSQL databases in general, and MongoDB in particular, is that you need to think in a different manner. You need to think about access patterns so you can leverage the performance capabilities of MongoDB. And maybe you want to talk, Daniel, about the different patterns for schema design.

Yeah, so the most critical part of getting good performance with MongoDB is to apply transformations that we catalog as schema design patterns.
And that may include duplication. In that case, it's a trade-off. Do I want to do that duplication so my queries are three times faster and I need half the hardware? And I'll pay the price of having a process that does an update once in a while. All these are choices, and you have to be able to make them. I think in the past, we were too rigid about things having to be normalized and the like. But you knew that as soon as you deployed your relational project, someone would come and denormalize something because it was way too slow. Joins are way too expensive. So with MongoDB, I think that's what people learned. And to me, this is also why the NoSQL movement was born: there was a need to have a way to store things differently that addresses the problems of the modern world. The modern world has data sets that are many orders of magnitude bigger than what we used to have in the past. And this is where we need to start thinking about the data differently. How are we going to be using it? One last thing: if we could freeze the size of all the data sets in the world and increase performance, I think we wouldn't need to model, because there would be a point where things would be so fast that they would fit. But the problem is that that's not going to happen. The data sets seem to always grow much faster than the resources we can throw at them these days.

So, it is, it is. I mean, we've been talking on theCUBE about these new insights that are coming out of this. If you do it right, the reward is new insights that AI will bring to the table. You mentioned, Pascal, you're a fan of the document model. You kind of laid out the benefits. It's almost as if the document model was kind of preordained as a pre-AI mechanism. Because if you look at what's going on with the large language models, it's language, and that's in documents. So the document model is a lot more friendly to AI, at least in my opinion. I'm sure you guys agree, or not; we should talk about it.
But then you say, okay, how does that scale? Because it's going to be powerful, but there are still other data sources coming in. You have to deal with other data.

Yeah, so one of the amazing things about these NoSQL databases versus relational databases is that they reveal relationships that you didn't know existed. Whereas in relational, you're forced to think up front about the relationships that you want to see in the data. So I think that indeed NoSQL brings...

No serendipity in relational databases. Hence my trigger word, schema design. But I think generative schema could be a technology. I mean, with generative AI, if you see a relationship forming... this is why I see what you guys are doing as relevant. If I'm a developer building an app and the data's sitting there, and there are new connections, maybe a neural net connection that's not there yet, that could be formed in real time and coded in at the point of code. So that's auto schema creation. I mean, I made the term up, it's kind of an oversimplification, but the trend is going that way. How do you set up for that? That's the question everyone wants to know. I want that set up for me so that if I do have that opportunity to create a value proposition out of the data, I can capture it. That's the big question right now, and nobody really has the answer; I haven't found it yet. It's the holy grail.

And even if it goes in that direction, we still need to know: what is a customer? What is a product? What is a quote? So, going back to the model being a communication tool, that part of it can't be replaced by AI, at least I hope not, at least for a few years. That part is still so important to what we're doing.

I mean, we're kind of riffing in real time here, but that sounds like not a data scientist, but more like a miner, like mining for gold.
You're looking at the data, you're looking for patterns, there's coding over the top that's either auto-generated by AI and/or human, and there's this opportunity to set the table for new functionality, whether you're a startup or a company. And I interviewed a company, your customer Current; they're a young team. Their entire discipline is: code fast, and if it doesn't work, come back. It's a two-way door. They go in, they try, they pull back. You know, my generation was, no, my idea is going to run. I'm telling you, it's going to be big. And then, you know.

We love that. At MongoDB, we absolutely love that, because we think we have the best database for that. We have a pattern in the book called the schema versioning pattern, where we explain how you can make updates to your schema without downtime. It's something you couldn't do in a relational database. Migration and changing the schema was always painful. It's not with MongoDB. So for us, you know, all the companies who say, go fast, I'll change the schema, make things better, add functionality, add whatever arrays and subdocuments: we do that very well with MongoDB.

I thought the keynote had a very visionary thing up there. They talked about the role of voice. Voice commands. You know, hey, CUBE, get me something. Hey, Siri. Hey, Mongo, build me an app for... So you start getting into this value proposition where it's not so much data science as it is discovery. So if you have this discovery mechanism, I don't know what to call it, but it's the data sitting there that has new insights that nobody has. That's where AI is shining right now: people discovering the use cases. How do you guys talk to other people in the industry who are kind of locked in to, say, relational, or, we do it this way? Our company runs on blank. Sorry, we're going to use these databases.

Yeah, that's a great question, because people bring their experiences with them.
So for somebody who's been designing databases in Oracle for 20 or 30 years, that's going to be a mind shift. I think even writing the book, we had those kinds of conversations. I might approach it one way, Daniel another, Pascal another. So that's not really a technology answer. It's more difficult; it's more cultural, more about our experiences. Not easy.

I think the data modeling motion is a developer motion, because you've got to foundationally lay out some stuff and let it kind of run.

True, true.

Do you guys see anything on the AI side that gets you excited about where modeling can go? Because you hear, again, foundation model, training my model, is there training data? I mean, all this is like training; I always say it's like pets. I train my dog to jump. Data is becoming that intimate for companies, where data is the lifeblood, and you're hearing words like training data, model the data, scale the data, leverage the data. That's very actionable thinking. And you guys are at the heart of it, data modeling. It feels like something is in there.

Yeah, definitely, definitely. I mean, from my perspective, I've been using tools like these large language models. I use them in three capacities. One is I use them to come up with better definitions. So if I'm working with a university and I have a starter definition for a student, I could use these tools to come up with a better definition. Also, when I teach modeling, a lot of what I do is question and answer. So you ask these six questions and you build your relationship. Possibly a large language model can do that for me.

So, when you're teaching and running your events, and now writing the books, what's the psychology of the people in the data modeling world right now? Where are they at? They've got to be sitting there saying, wow, with everything I'm doing, I've got a superpower. I mean, if I'm a data modeler, you know, it's a lot of grinding, okay?
A lot of grinding and a lot of great work, getting in the weeds, getting the data, getting your hands on it. But now they're at the center of the value proposition. What's the psychology of these personas, the users, the operators, the developers?

It's really interesting. I don't know if you guys would like to add to this, but the typical data modeler is us, right? We're not the superheroes. We like to be behind the scenes, I think. We're not the sexy jobs. To use Pascal's analogy from earlier, if we're building a skyscraper, we're probably the ones laying the foundation. We're always the unsung heroes, but without us, the building would fall, right?

Yeah, it's a critical role. I mean, you can't build a building without drawings.

Yeah, that's right, that's right.

Oops, shouldn't have put that wall there. All right, final question. What was the biggest debate in the book? What to take out, what to add in? Was there any wrestling?

It was about whether the logical model is truly technology agnostic or not.

Yeah, exactly.

And so, you know, this is an intellectual debate. I don't know how many people are really interested in these nuances, but, you know, it brought some interesting conversations.

We don't want to give too much away, because we actually added comments in here that discuss why we felt differently, even.

All right, well, how can they get a hold of you? Here's the book here, MongoDB Data Modeling and Schema Design. How do people get a hold of you guys if they want to reach out? Is there a website, LinkedIn?

We're all on LinkedIn.

Yeah.

And the book is available on Amazon. All over the world.

Yeah, get on Amazon, download it; ebook too.

Everything, all formats.

Google, PDF, right?

Any format.

All the models were done in Hackolade, so, Hackolade.
The models, yeah, all the illustrations were done with our software, and we have a repository on GitHub so people can get started easily.

Did you ingest this into generative AI and a knowledge base yet?

I'm not scared yet.

It's definitely going.

For the next book, we'll ask Josh to be here.

For the second edition.

For the next book.

Daniel, give me the final word, Mongo. What's the hottest, next big thing here? Developer advocate, developer-first, developer-led platform. You guys really have a great focus.

One of the greatest things we've discovered this year is the team that we built, which I am now in, that specializes in doing design reviews with our customers, our top customers or strategic customers. We realized how much value there is in helping them with their data model. Their groups need to communicate with each other, and there are a lot of things they don't really know, and just getting them over the hump to do a good model with MongoDB has been really incredible for our organization.

And the globally distributed system you guys have. Great failover, great performance. I mean, it's going to be...

It's all the features in the product.

It's going to be a treasure trove of insights and serendipity in the data. So, like I said, get the book. All right, this is theCUBE's live coverage. We'll be back with more here at MongoDB.local in New York City, packed house. I'm John Furrier, we'll be right back.