Hello and welcome. My name is Shannon Kemp, and I'm the Chief Digital Manager of DataVersity. We'd like to thank you for joining this DataVersity webinar, Moving from a Relational Model to NoSQL, sponsored today by Couchbase. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we'll be collecting them via the Q&A in the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share your questions on Twitter using hashtag DataVersity. And if you'd like to chat with us or with each other, we certainly encourage you to do so. Just click the chat icon in the bottom right-hand corner of your screen for that feature. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and additional information requested throughout the webinar. Now, let me introduce our speaker for today, Matthew Groves. Matthew is a guy who loves to code. It doesn't matter if it's C#, jQuery, or PHP. He'll submit pull requests for anything. He has been coding professionally ever since he wrote a QuickBASIC point-of-sale app for his parents' shop back in the 90s. And he currently works as a developer advocate for Couchbase. And with that, I will give the floor to Matthew to get today's webinar started. Hello. Thank you, Shannon. Can you all hear me okay and see my screen okay? You sound great. Yeah, let's go. If you can't, just say so in the chat and we'll see what we can do. But I vote that we all just go back to listening to Ella Fitzgerald and Louis Armstrong for a while. That was really great. So thank you very much to DataVersity and Shannon for hosting. And my name is Matt Groves. Actually, that bio is a little outdated. My title now is Product Marketing Manager, but I'm still very much a developer-focused guy.
I do still enjoy lots of coding, but I'm now working with the marketing team to try to, you know, give a little bit more of a technical view to product marketing. So anyway, with that in mind, today's objectives are: we're going to give an overview of how we might think about migrating an existing database to NoSQL, in our case, Couchbase. But even if you don't plan to do that and you want to start with NoSQL on a greenfield project in the future, this will help you compare and contrast NoSQL with relational. So even if you're not migrating an application, you can still think about migrating your next project, and migrating your mindset and your skill set. So this will give you a sense for some of the technical issues and differences between the two. We're not going to get into really deep details here. I'm a very technically minded person, and I'm happy to discuss those with you afterwards, but we're not going to get too deeply into those today. Yes, so this is the agenda. We're going to do a quick intro to NoSQL, a quick intro to Couchbase, some of the migration options you have at a high level, how you might then go from initial migration to optimizing, and then I'll show you a quick demo of some NoSQL in action. But again, we're not going to get too deep into that today. All right, so introduction to NoSQL. There are use cases where you need to track a lot of information. So for instance, e-commerce. It's not just about tracking the purchase. It's also about tracking what went into that purchase. What items did a user view? What did they add to a cart or remove from a cart? They do searches, they read and leave reviews, they ask questions, they add to wish lists, all these activities that go into the one transaction at the end of all those activities. And that can be a huge amount of data.
In other verticals, we see a similar thing. In the supply chain, it's all the stuff that goes into building something before you're fulfilling it. Or IoT, all the things that go into operating the IoT device and then ultimately providing a service. And this can be a huge amount of data. Sometimes this is called massively interactive enterprises. And the application requirements are dramatically different for these types of use cases. So you have your transactional database, the traditional relational database we're talking about. You also have an analytical database. This is for reporting or BI. You might use an ETL to dump data in there every 24 hours, for instance. And NoSQL really overlaps these cases, where scale and flexibility and high availability and performance can provide lower costs. So especially when it comes to storing those interactions, all those activities leading up to the end-result transaction. And the first generations of NoSQL don't really have capabilities like joins or ACID transactions. They don't have SQL as a query language. So they don't fit every transactional use case out there. Despite that, historically they've still been used for some of those use cases where high performance and scale are really, really important. So places where we have a high level of interactions, like microservices, you know, large-scale consumer-facing websites, IoT, et cetera. But Couchbase, now being a later-generation, mature NoSQL database, actually supports SQL, and not just joins, but the full SQL language, and supports ACID transactions. So we're actually going to start to see a bit of a shift in the use cases that NoSQL can deal with. And just to add a little technical detail here: the NoSQL databases we're talking about are document databases. Like Couchbase, they store data typically as JSON; not always, but most of the time it's JSON.
And this is different from relational databases because data can be much richer. You can store arrays and nested objects and arrays of nested objects. And sometimes they're called sub-documents. And so this is just a quick refresher slide on JSON. You know, you're probably familiar with this. You've at least heard of this. This is a very, very common notation language for data. But why JSON? For a distributed NoSQL database, a key tradeoff is that each piece of data needs to be independent to make distribution easier amongst all the machines in a cluster. Now, even though that data is independent, so it's not relational, it's not in a table, it still makes sense to put it into a useful format. So why JSON, then? Of all the formats, why would we pick JSON? Well, I think the reason is JSON is widely supported by every language and platform that I can think of. And its use goes beyond databases. And of course, it comes from JavaScript. So chances are JSON is already being used widely in your applications and enterprise. And I would think it's fair to say that most developers are at least familiar with JSON, if not experts in JSON at this point. And this eye chart over here, just to drive home the point, is a list of, it might be hard for you to see over the stream, but it's a list of JSON parsing libraries by programming language. So it's widely supported by every technology out there. You might be surprised by some of the technologies on there that support JSON. So JSON is a very good format. That's why it's most commonly used for document databases. So let's look at a comparison here of tables versus documents. This is just a really simple comparison. So on the left, we've got a user table and a product table. Now on the right side, in our NoSQL database, instead of tables, we've got something called collections. So they're not exactly the same, but you can kind of think of them as analogous. On the left, we have a table.
Each table has a number of rows of data. And so on the right, each collection has a number of documents. And so you can kind of, again, not a perfect comparison, but you can kind of equate a row of data to a document in a NoSQL database. Now these aren't just naming differences. There are underlying differences in how these work. But for comparison purposes, if you're coming from a relational background, this is a good place to start. So you can kind of see the equivalence there between the first row of the user table and that first document in the user collection on the right. So this is a small example here of the difference between a relational and a non-relational database. On the left, every row in that product table there, that's the one on the bottom left, must have a value. Every one of those tuples must have a value, even if it doesn't make sense to have a value. So in a product catalog, quantity might not make sense. For instance, if you're storing a service or something in the product table, it might not make sense to have a quantity. But since it's relational, we have to put something in that quantity field. So maybe it's a null, or I've seen before maybe a magic number like zero or negative one or who knows what else it could be. We've got to put some sort of value in there. On the document side, however, the quantity can just be flat-out omitted. If quantity doesn't make sense for a product, we don't have to have quantity in there at all. Now on the flip side of that, if I want to add a restock value for my product, maybe I need to restock something. Again, if you're putting services in your product table, then restocking maybe doesn't make sense. But if I want to add a restock to certain things like the blard nozzle or the hepto shaft, I would need to add a column. And again, a value for every row in that column. So I'd probably have to fill it up with nulls or negative ones or something like that, even if restocking makes no sense.
Now on the document side, I can add a restock field to a single document in the collection if I want to. And it has no effect on the other documents in that collection. Now this is just a small taste of the schema flexibility that is afforded by document databases. And this has made document databases a great choice for use cases like content management, catalogs, IoT, use cases where there's a lot of data and a lot of variation in the data. Now of course there's a trade-off here. The schema is not enforced, at least not at the database level. So that's something you have to deal with; you can see that as a benefit or a trade-off. In many organizations, this can help with agility. So in some organizations, there's no longer an interdepartmental memo to get a field added to the table. And therefore you can free up DBAs to spend more time being proactive in their database administration, instead of having to spend lots of time responding to requests for field changes. So that's just, again, a small example of how it can help with agility, both technical and organizational. And one more small note before I keep going along here, an important one that I get asked about a lot, so I figured I would talk about it: JSON famously does not have a date-time in it, or a date type, I should say. This is just a part of using JSON; there's no date type. So like many things in NoSQL, you've got lots of options. So here are just three examples. We could store it as ISO 8601, like a string. We could store it as an array with all those components broken out into individual elements of the array. We could also store it as a UNIX timestamp, an epoch, over there as a number. Now, among those three, of course, again, it's going to be up to your use case, but the most efficient to store is going to be the epoch or the UNIX timestamp, because that's just a long. And the string is going to be a lot more bytes than that.
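To make those three options concrete, here's a small Python sketch showing the same moment in the string, array, and epoch representations just described. The specific timestamp is made up for illustration:

```python
from datetime import datetime, timezone

# One moment in time, in the three JSON-friendly shapes discussed above.
moment = datetime(2021, 2, 19, 14, 30, 0, tzinfo=timezone.utc)

as_iso = moment.strftime("%Y-%m-%dT%H:%M:%SZ")         # string: readable, sorts correctly
as_array = [moment.year, moment.month, moment.day,
            moment.hour, moment.minute, moment.second]  # array of components
as_epoch = int(moment.timestamp())                      # number: most compact to store

print(as_iso)    # 2021-02-19T14:30:00Z
print(as_array)  # [2021, 2, 19, 14, 30, 0]
print(as_epoch)

# Round trip: the epoch can always be turned back into the string form,
# which is the kind of conversion a query language gives you as functions.
assert datetime.fromtimestamp(as_epoch, tz=timezone.utc) == moment
```

The epoch wins on storage size, but the ISO string stays human-readable in query results, which is often worth the extra bytes.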
But it really depends on what you're doing and how you're querying the data. And so you have options here when it comes to dates. Now, certainly, JSON doesn't have a date type, but any sort of query language or interaction with your data should have a variety of functions or options for how to treat various pieces of data as dates. So I could have a function that converts a string to an epoch or an epoch to a string, compare dates and get a difference, things like that. So we may not be able to store a literal date type, but we can certainly treat these fields as if they were dates. Okay. So that's sort of an intro to NoSQL and document databases. And I want to dig into Couchbase here a little more specifically now. So this is a diagram of Couchbase. And there's going to be a quiz on this at the end, so make sure to memorize every single one of these wedges. I'm just kidding about that. But this is kind of how we like to represent Couchbase. There's that solid core in the middle that everything else in Couchbase has been built upon. And these are kind of the three principles of how Couchbase is built and how you work with it. So the top one is developer agility and versatility. So we already talked about JSON. Certainly, you can store JSON data in Couchbase. It's a document database. And you can adjust your schema in an agile way. And you can also use Couchbase as a plain key-value store, if you like, or if you need to. So it's multi-model. You can do either of those. It's also what you might call multi-mode. So Couchbase has a memory-first architecture, so it's got a built-in cache, basically. And we also introduced ACID transaction capabilities this last year. And one other thing you can do with Couchbase: the core of it was built for operational workloads, but we've also added an analytics workload option to it as well.
So you can use the same cluster to perform your operational work and your BI analytics work, and the data gets shared and updated in real time. And it's programmable as well, just making it easy for developers to work with. So the schema flexibility, of course. Couchbase was really the first NoSQL database to introduce a full SQL implementation, which we'll see some of later. That's called N1QL, pronounced "nickel." And we've got plenty of SDKs for the popular programming languages. So .NET, Java, Go, Node, and Python, what have you. Performance is an important consideration for Couchbase. So we're not just trying to rebuild, you know, a relational database. We're trying to continue to have a focus on performance. And so with that in mind, the core is designed to scale out without being a hassle. It's a shared-nothing architecture. Many parts of it are asynchronous. And the architecture is elastic. Replication is built in not only at the cluster layer, but between clusters, between data centers. And so with those capabilities, we can have something that's always on, with high availability. It can be globally distributed with replication. And as you notice here, the yellow wedge there, we can have a mobile component. So we can go all the way to the edge, with an edge device or a mobile phone, and have a database running there that can synchronize; more on that later. And it can be easy to manage. So all these wedges here can be managed independently with workload isolation. They're not separate components, but they can be isolated within that same cluster. Automatic cluster rebalancing: this is a really, really cool feature. So if you add more nodes, more computing power, to your cluster, the rebalancing takes place automatically. You don't have to worry about hotspots or configuring any sort of sharding or anything like that. Location and deployment agnostic, meaning you can deploy Couchbase anywhere you want to or need to.
More on that later. That includes Kubernetes and microservices. I like to think of Couchbase as the best of both worlds, NoSQL and relational. So for instance, the ones that are highlighted right here are the benefits you get from NoSQL. You know, a key-value store, JSON documents. We've actually got a full-text search engine; that's one of those components that's included with the cluster. Memory-first, asynchronous architecture, and masterless. So all those things that make NoSQL attractive for large-scale deployments: high availability, easy scaling deployments. We've also added SQL, which has been there for a while, and we've just recently added ACID transactions. Those are kind of the best things from the relational world, and indexing as well, to support the SQL queries so that they are as performant as you can make them. You want those queries to run as fast as possible. I mentioned that it's deployment agnostic, also sometimes called cloud native. You can deploy Couchbase basically anywhere and however you want to. So you can deploy in your own private cloud, your own data center. You can deploy it to bare metal, to VMs, to Kubernetes, or in a public cloud on VMs or Kubernetes there. And we've also recently introduced a fully managed DBaaS, database-as-a-service, and that's managed by Couchbase. And right now the options with that are AWS and Azure. There are going to be more options to come there, but that's called Couchbase Cloud. And which one you choose is going to be up to you. And the great benefit of this is you can actually do a hybrid approach. With Couchbase's core replication and sync capabilities, you can run a hybrid. You can have disaster recovery in your own data center and the main cluster in your cloud, or vice versa. I mentioned the mobile component, or sometimes called the edge these days.
It's a device that may be running completely offline in a low- or no-internet location, or a place where internet is very expensive. And then when you collect the data or browse the data in that location, when you get back to a location where internet is more easily accessible, you can sync up the data when that connection is available. So it's an offline-first approach, with Couchbase Lite there and Sync Gateway to manage the synchronization. Of course, Couchbase Lite can run completely independently on its own, but using sync will give you a lot of benefits if you want to keep the data in sync with other users or with your main data center. And synchronization across different data centers: this is a tool called XDCR built into Couchbase. And again, it can include your on-prem data centers, your cloud, or both. And these are active-active replicas. They can be used for disaster recovery. You can do some geo-fencing with them. You can just use them to reduce latency, to have data centers closer to European customers, for instance. And this is a feature that's been in Couchbase for a long time. It's part of the core of Couchbase. But users love it, and we continue to make it as flexible as we can to support more and more use cases. Okay, so that's a quick note on Couchbase. So let's get into some of the migration options. So let's say you're facing this problem right here. And I'm guessing at least some of you are: your relational database is working fine, but maybe there's something coming in the future where there's new demand. And you aren't sure if your relational database can handle that scale, and/or that scale might cost you a lot. And so you're looking at other options like Couchbase for some or all of your workload. It doesn't have to be all or nothing. It can be just part of your workload, for instance. It's a very common situation. So the issues for you to consider are pretty much the same as in any sort of migration. How risky is this going to be?
Does our team have the skills or expertise to work with the new database or the new tool? What's it going to cost to do the migration effort? And once we do the migration, is it going to be fast enough? And once it's migrated, can we scale? Can we handle that increased scale? So with that in mind, here's a list of some of the most typical options for database migration. And this is kind of ranked by risk and effort. But of course, there are also other factors in here, like I mentioned: cost, and do we have the right skills? So number one is the highest risk, highest effort. Just rewrite the whole darn thing. Just drop the old system, maybe with a few pieces of it left over, but build the whole thing over again. That's the riskiest, most costly, highest effort option. Number five, down at the bottom, is the lowest effort, lowest risk option. Number five is where we basically just make the best effort we can to move our data as it is over to the new database, and maybe take our queries and data access and move them over to the new database just as they are, without having to do any sort of remapping or remodeling or anything like that. And no optimizing either. And so then you can see, once you do number five, you can take a step up to number four at that point. So we've actually got it moved over and it's working. Now, how can we take advantage of the unique capabilities of NoSQL to optimize it, to get it running faster, and to refactor it to work best with that new technology? In some cases, number five might be all you need. That would be a very good situation, but you probably will need to go to at least number four at some point. It may not be the whole thing at once. Again, maybe just parts at a time. We'll discuss that here in a little bit. So that's what I want to tackle first: migrating without denormalizing. So I mentioned that with JSON you can have rich data structures. You can have nested objects.
You can have arrays. We're going to basically take a first pass at migrating data without doing any of that, without taking advantage of any of those capabilities of JSON. And so I like to use a kind of paper shredder metaphor when I talk about relational databases, because data is heavily normalized and then is joined together when necessary. So think of a paper shredder. Let's say I fill out a tax form, which I'm working on already this year, filling out a tax form on a piece of paper. And you know, a tax form has all those different lines on it, all those horizontal lines. So then I run it through a paper shredder, and I take all those horizontal lines and I store those shreds in their own folders. So line five goes into the five folder, line six goes into the six folder, et cetera. And then when I want to look at or change the form, I go through each of those folders, pull out the shreds, line them up on my desk, and put the document back together. And that's kind of how I see a relational data store working: I have all these different pieces that are split up and stored in multiple tables. Now, in Couchbase, or a JSON database like Couchbase, you don't have to put them through the paper shredder. In fact, a document database is going to perform better by keeping all those shreds together and not shredding them in the first place. However, if we're looking at the easiest option and we're migrating from relational, it may be best to keep them in the normalized, shredded state, at least at the beginning, and then optimize them as we need to as the system goes forward. So we're going from a record-centric, sort of 2D paradigm to a document-centric model. But we're going to start by just keeping it at that 2D paradigm that most developers and most architects are already familiar with. So that also kind of helps solve the "do we have the skills for this migration?" question.
So in a relational database, we saw this kind of earlier, you have tables which contain rows. In Couchbase, you have collections which contain documents. And so this is kind of the Google Translate version of the migration process: we just take a row and make it into a document, take a table and make it into a collection. So instead of a user table, I have a user collection. And then instead of two rows, I have two documents. And I translate the data into JSON there. And I have a relatively flat, simple JSON object. So I can take the primary key of my relational table, which was blue123-4 in this case, and I can make that the document key of the Couchbase document. So each document has its own unique key. I mentioned you can use it as a key-value store; this is kind of the underlying structure, key-value, but I have JSON in there as my value. And all the values from the row, so age 42, name, and admin, all go there into the document. I could also store a copy of the user ID inside the document if I want to. Do the same for the second row, and now we've got two documents in a collection. And so just a quick look then at what denormalizing would look like. So notice the shredded version on the left. This roughly corresponds to a single document on the right. So we've got three tables and four pieces of data on the left. We've got one document, one piece of data, on the right. And by doing this, this makes horizontal scale easier, because each piece of data is isolated. It's not beholden to a given table or schema, which means it can live on any given server in our cluster. So this is how you might want to end up eventually. To start with, you want to do a very simple translation of one row of order to one document and so on. But eventually you might want to get to the point where you have just a single document that contains an order that has all those different pieces of data already embedded into it.
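As a sketch of that "Google Translate" step, here's roughly what the one-to-one row-to-document translation looks like in Python. The blue123-4 key and the age/name/admin fields come from the slide; the column names and the name's value are assumptions for illustration:

```python
import json

# Hypothetical relational row from the user table described above.
row = {"user_id": "blue123-4", "age": 42, "name": "Sample Name", "admin": True}

# One-to-one translation: the primary key becomes the document key,
# and the remaining columns become top-level JSON fields.
doc_key = row["user_id"]
doc_value = {k: v for k, v in row.items() if k != "user_id"}

print(doc_key)  # blue123-4
print(json.dumps(doc_value))
```

You could also keep a copy of user_id inside the document body, as mentioned above; it's redundant with the key but sometimes convenient for queries.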
In the long run, to improve performance, once you've migrated the shreds over, you can start to selectively denormalize where it makes sense to. So some of the shreds might just stay as shreds indefinitely. So you can have your data storage aligned directly with the object model of your application. And there's a problem known as impedance mismatch. If you use a tool like Hibernate or Entity Framework or an ORM like that, you're probably familiar with this; those tools are there to help solve the problem of translating normalized data into objects for your application. Now with your data being denormalized, you don't need to spend as many resources on joins and transactions. Although, again, just to reiterate, to put your mind at ease here, a mature NoSQL database like Couchbase supports both joins and transactions. And finally, storing denormalized data makes CRUD operations super, super quick, not to mention that Couchbase again has that built-in memory-first architecture. So those denormalized reads and writes will often be reading from and writing to memory instead of waiting on disks and joins. We'll come back to this a bit more when we talk about optimizing. But even with your data denormalized, you still have the ability to write SQL queries and indexes. So you get the flexibility either way. Now speaking of SQL, I've touched on this earlier. Let's take a look at what SQL queries look like in a relational database and what they look like in Couchbase. So I've got kind of a SQL Server example here on the left, kind of a pseudo-example. This could easily be Oracle or Postgres or MySQL or whatever; you'll get the idea. This is a query that returns a list of all the landmarks that are in the same cities as major airports. So I have a subquery here, selecting everything from landmarks where the city is the same as the city in the airports table.
And then I have the Couchbase query on the right, where the syntax is largely the same. So the main difference is that in SQL Server, you select from a catalog, and then you have a schema like dbo, and a table. In Couchbase, we have a bucket called travel, we have a scope (in this case I'm using the default scope), and a collection called landmark. Otherwise, it's pretty much the exact same query, which again is going to help you if you're trying to make that migration, in that you already know how to write SQL. And so you can take the same approach with a few tweaks, like you would between any two different SQL implementations. There are going to be some differences, but you'll be able to perform the same sorts of things in Couchbase with the nickel query language, N1QL. So roughly speaking, again, a table is like a collection. They aren't exactly alike; a collection does not have a defined schema that it must adhere to. Joins and everything you'd expect from SQL are here in Couchbase as well. So MERGE, CTEs, subqueries, unions, all that stuff is there, and more. So this is why we often refer to it as SQL++. It's SQL plus some extra capabilities to deal with JSON: to deal with nested or embedded documents, or arrays, things like that. And then indexes. As a SQL developer, I've got to confess I was often very lax about creating indexes. And I generally got away with it. But we're talking about distributed data now, multiple servers and probably a lot of data, so indexes are very, very crucial. And the same sort of indexing language you'll see in SQL Server, you'll see in Couchbase as well. And so you can create an index here on a given collection or a given set of documents. And there are lots of types of indexes. You can create functional indexes, partial indexes, covering indexes, and so on. There are even some flexible indexing options if you want to do a limited kind of ad hoc index as well.
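To make the relational half of that comparison runnable, here's a small sqlite3 sketch of the landmarks-in-airport-cities query, including an index on the filtered field. The table and column names and the sample rows are assumed from the spoken description, not taken from the actual travel-sample data; the N1QL version would look almost identical apart from the bucket.scope.collection path in the FROM clause:

```python
import sqlite3

# In-memory stand-in for the relational side of the slide's example.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE airport  (name TEXT, city TEXT);
    CREATE TABLE landmark (name TEXT, city TEXT);
    INSERT INTO airport  VALUES ('SFO', 'San Francisco'), ('LHR', 'London');
    INSERT INTO landmark VALUES ('Golden Gate Bridge', 'San Francisco'),
                                ('Eiffel Tower', 'Paris');
    -- An index on the field the subquery filters on, as discussed above.
    CREATE INDEX idx_landmark_city ON landmark (city);
""")

# Landmarks located in the same cities as airports.
rows = con.execute(
    "SELECT name FROM landmark WHERE city IN (SELECT city FROM airport)"
).fetchall()
print(rows)  # [('Golden Gate Bridge',)]
```

The subquery shape carries over to N1QL essentially unchanged; only the naming of what you select FROM differs.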
And GSI here is called Global Secondary Index. So this is an index that's stored separately from the data, which means it's more performant to do it that way. You don't have to gather index results from every node in your cluster; you're just gathering them from the one index service. And in some cases, a query may not even need to access the data directly. That's the covering index I'm talking about. So if you're selecting fields that are all indexed, you can just pull the data right from the index. You don't have to take the extra step of fetching the data. So this is very important when it comes to performance: getting the indexes right. Okay, so to sum up, these are the keys to success to consider when moving from relational to NoSQL. Does your NoSQL database have the features that you need to make the transition easy? So SQL, ACID transactions, and joins. Do you understand the mapping between tables and collections, schemas and scopes, and databases and buckets? And dealing with the SQL dialect transitions, that's going to be something you'll have to learn about, like the name of the function for getting the difference between dates; it might be slightly different than what you're used to. And then optimize later. So once you've got everything moved over and your queries all set up and working, then it's time to think about optimizing. So these are all the key considerations for optimizing. Okay, yes. Right. I think I left that slide in by mistake. Joining and transacting is costly. Even in a distributed architecture like Couchbase, joins and ACID transactions still take a performance hit. So you still have to consider how many of those you're going to be able to support in there. And use your well-designed application code as a guide. So if you're migrating an existing application, you can look at your Entity Framework code.
You can look at Hibernate and say, okay, well, can this help me create an optimized object in JSON? For instance, a person contains addresses, right? You may want to merge addresses into the person document, but you may want to keep order separate. If you're familiar with the idea of aggregate roots from domain-driven design, it's the same sort of concept here. So think in terms of entities: in the scope of your application, does this entity have a lifecycle outside of the thing that references it? So the optimization question to ask is: can I reduce the number of transactions and joins required, and have I met my performance and scale requirements? So here's one way you can reduce joins and transactions, a hypothetical again. We've got SQL Server on the left there, and we've got a Couchbase approach on the right. Suppose a user has a checking account and a savings account. They're in two rows of data in this model. With the relational database, to transfer between them, I have to have a transaction, because I have to update one and update the other. If, however, they were in the same row of data, you wouldn't have to have a transaction. You wouldn't have to invoke that overhead. However, in relational databases, that's not a flexible design at all. With a document approach, you could have, in my case, a nested object called balances. I could store them both in the same document and update them without a transaction, and still have the flexibility of adding other accounts in there under balances. So think back to those five steps, right? In the first step, you might be starting your migration with two documents, but combining them into a single document is an example of optimization to reduce that join and transaction pressure on your database. You can also analyze your logical model as best you can. So the key element of optimization here is to determine where and how you can consolidate your data.
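Here's a minimal sketch of that single-document balances idea, with field names and amounts assumed for illustration. The point is that the transfer touches exactly one document, so in Couchbase it would be a single key-value write rather than a multi-row transaction:

```python
# One user document with both accounts nested under "balances",
# as in the document model described above.
user_doc = {
    "name": "Sample User",
    "balances": {"checking": 500.00, "savings": 1500.00},
}

def transfer(doc, src, dst, amount):
    # Mutates a single document in place. In Couchbase this would be one
    # key-value write, with no multi-document transaction needed.
    if doc["balances"][src] < amount:
        raise ValueError("insufficient funds")
    doc["balances"][src] -= amount
    doc["balances"][dst] += amount

transfer(user_doc, "checking", "savings", 200.00)
print(user_doc["balances"])  # {'checking': 300.0, 'savings': 1700.0}
```

And because balances is just a nested object, adding another account type later is a document change, not a schema change.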
So take those shreds and put them into larger individual pieces. Let's look at a diagram of a more complex sample. We're thinking in entities, or aggregate roots, like I mentioned with domain-driven design. So maybe what I've highlighted in orange there is the aggregate. In this model, orders contain items, so we can switch to embedding items inside the order document if we want to. Or we can group them together differently: we can embed pay type inside of order, and instead of embedding items, we can embed product into each item. This is how it looks here: the items array contains IDs, and those IDs point to the various items and products instead of nesting them within the order. But now item and product are grouped together, which makes them more efficient because they're one piece of data. The other consideration to make here is whether to use an array, a nested object, or even an array of objects. The way you work with these options can vary, so you need to know how you plan to access this data. Is there a defined sort order you're relying on? If so, arrays might be the way to go. If you need to update each phone number on an individual basis, then nested might be more efficient. I've got nested on the left and an array on the right, just to show you the two approaches. So we're coming up toward the end of the presentation; we'll get to the demo. But this is the conclusion: migrate first, optimize later. That's going to reduce risk and, over time, increase performance. Joins and multi-document transactions are costly in relational, and they're going to be more costly in a distributed architecture. So the one-to-one migration you start with might be slower until you get to optimize; then you can start embedding documents as an optimization.
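For example, the two shapes for phone numbers might look like this (a hypothetical person document, not one from the slides, with both forms shown side by side):

```json
{
  "name": "Jane Doe",
  "phonesNested": {
    "home":   "614-555-0100",
    "mobile": "614-555-0101"
  },
  "phonesArray": [
    { "type": "home",   "number": "614-555-0100" },
    { "type": "mobile", "number": "614-555-0101" }
  ]
}
```

The nested object makes it easy to address one number directly (`phonesNested.mobile`); the array preserves a defined ordering and allows duplicates.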
Queries can be converted as long as you account for the Couchbase and JSON differences. And transactions are available, but as you've seen from embedding, you may not need them as much to meet your scale and performance goals. SQL queries can be converted; as I mentioned, this is not as onerous a process as you might think. Couchbase's SQL language, called N1QL (pronounced "nickel"), is very much a standard SQL implementation. You're always going to have differences between, for instance, Transact-SQL and PL/SQL and PostgreSQL and so on; the same thing applies for Couchbase. I just wanted to show you this little graph on the right, based on user surveys: N1QL, Couchbase's SQL language, is the top reason users prefer Couchbase over a database like MongoDB, for instance, where converting queries from relational is not going to be as straightforward, or where you're going to be learning an entirely new language, which is more time consuming and riskier. Okay, I want to switch over to a quick Couchbase demo. I've got Couchbase running locally on my machine here; let me know if that's not legible. I tested this out earlier, but I just want to make sure everybody can see it okay. What I'm going to do is bring in a sample bucket. By the way, this is Couchbase Server 7, currently in beta, and I want to show you some of the collection features that are in that beta. So I'm going to load the travel-sample data set; this ships with Couchbase and loads travel data into a travel-sample bucket. A bucket is analogous to a database, or what SQL Server calls a catalog. Again, it's analogous, but it's not a relational database. Just by creating this bucket, you get all the scaling, high availability, and flexibility that NoSQL typically provides, plus some of the familiar elements you know from the relational world.
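As one example of the dialect differences mentioned earlier, the date-difference function goes by different names across SQL implementations (the T-SQL line is shown only for comparison; the N1QL function used here is the variant for ISO-8601 date strings):

```sql
-- T-SQL (SQL Server):
-- SELECT DATEDIFF(day, '2021-01-01', '2021-03-01');

-- N1QL (Couchbase) equivalent for date strings:
SELECT DATE_DIFF_STR('2021-03-01', '2021-01-01', 'day') AS days_between;
```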
So let's dig into this travel-sample bucket; I think it's all loaded now. I'll click here on scopes and collections. I want to show you that we've got two different scopes: a default scope and an inventory scope. You can create a lot of scopes. A scope can be used to isolate data to a microservice: if I have an inventory microservice and a payment-processing microservice, those might be two different scopes, for instance. Or I could use scopes as a multi-tenancy approach: customer one could be this scope, customer two the second scope, et cetera. And because we have these scopes, we can also assign security roles and permissions to them, so you can give a given user permission to just one scope, or read-only access to just one scope, things like that. This is roughly analogous to the schema level in SQL Server; I'm not sure what it would be called in other relational databases, but that's where it sits. I can click on a scope and see a list of collections. In this one, we've got five collections: airline, airport, hotel, landmark, and route. These are analogous to tables: inside the airline collection are just documents for airlines, and inside the airport collection are just documents for airports. Let's go into one of these; we'll go to route, I think, and click on documents. These are all the individual route documents inside that collection: travel-sample bucket, inventory scope, route collection. Let me edit one so I can show you that it's JSON in there. You can see there's an array of objects for the route's schedule, and these are all the other fields and values of the JSON object.
I also want to take you quickly to indexes. I can view the indexes by bucket. I'm not going to get into this too much, but you can see there are lots of indexes created here. For instance, this is an index on the airport collection, on just the airport name field. I can create other indexes on multiple fields; this one here is much more complex, on source, destination, and the schedule array as well. So there are all kinds of indexes you can create, and you need the indexes for the next step, which is querying. I'm just going to copy and paste a SQL query here quickly; again, the language in Couchbase is called N1QL, pronounced "nickel," but it reads like SQL. I'm going to select everything from the route collection where the source airport is CMH, my home airport of Columbus, Ohio. Execute that, and I get 63 results back. It might be hard for you to see; let me zoom in a little. There are 63 documents in the results, and the results down here are in JSON. Now, I could put them in a table view; it's still JSON behind the scenes, but I can at least view it like a table, which sometimes works well and sometimes not so much. It's a little awkward here because we've got a table inside of a table with the schedule, but it's ultimately just JSON. And I want to show you a join. Notice here I've got this airline ID that says this route corresponds to a given airline, but showing a customer "airline 321" doesn't really give much information. So I want to do a join to the airline collection to get the name of the airline. I'll paste this in to save some typing: select the airline ID, source airport, and destination airport, and join the route to the airline on the airline ID.
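The query from the demo looks roughly like this (reconstructed from the description rather than copied from the screen):

```sql
-- All routes departing from Columbus, Ohio (CMH).
SELECT r.*
FROM `travel-sample`.inventory.route AS r
WHERE r.sourceairport = 'CMH';
```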
And the syntax here is basically saying: take the key of that document and pull the airline name from it. Execute that, and you can see now I've got the airline name in addition to the airline ID, so I know this is Delta, and so on down the list. I can also put this in table view; it looks a little cleaner as a table because it's all just flat data now. So there's a relatively simple join in Couchbase, and it can get a lot more complex, of course, just as with SQL, as you need it to. But keep in mind that indexes are very important, so you might want to look at the index advisor, which will say: here's the index the query is currently using, and here are some recommendations. A covering recommendation, for instance, would be if you want to index all those fields so you can skip the fetch step of the query. I'm getting a bit too deep into the weeds here, but I just think that's cool. That's all I want to show you in the demo; happy to come back to it if the questions call for it. One more thing I'd like to mention: this is a relatively new offering for us, but it's very exciting. It's Couchbase Cloud, a fully managed database-as-a-service; you can go check it out at couchbase.com/product/cloud. We announced last week, I believe, full public availability of Couchbase Cloud on Azure, so we now support AWS and Azure, with more to come in the future. I think there's a free trial running right now, with some promotions for cloud credit, things like that. So go check that out if you're interested in exploring it. But again, you can also just download Couchbase and run it on your local machine, as I've been doing for this demo. There are lots of ways you can deploy and try out Couchbase. Okay, I think that's it for what I have prepared.
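The join shown in the demo would look something like this (a sketch using the travel-sample schema; the aliases are assumptions):

```sql
-- ON KEYS joins each route document to the airline document whose
-- document key is stored in the route's airlineid field.
SELECT a.name AS airline_name,
       r.airlineid, r.sourceairport, r.destinationairport
FROM `travel-sample`.inventory.route AS r
JOIN `travel-sample`.inventory.airline AS a ON KEYS r.airlineid
WHERE r.sourceairport = 'CMH';
```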
I'm happy to answer some of these questions; I don't know if you wanted to ask them, Shannon? Yeah, let me jump in here, and thank you for this great presentation. There are a lot of questions coming in. If you have questions for Matthew, feel free to submit them in the Q&A portion in the bottom right-hand corner of your screen. So, diving in: how do you respond, at the number-five "optimized" level, when there's a performance ticket and the client is screaming? Okay, you're talking about the one-to-five rating I was discussing; let me roll back to that slide right here. So how do you respond to that when there's a performance ticket? Really, you have to know going in whether this migration is going to meet the performance needs you have. If it's gotten to the point where the client is screaming at you, you may have gone a little too far in the process without doing proper testing and optimization. At the number-five level, how to respond depends on who your customer is and what the use case is, but the response is: we decided to do this migration for whatever reason, to save costs, or as a temporary step, and now we optimize. You may not even want to go into production at the number-five level; you may want to try it out as a proof of concept first, get people to try it, and let them scream at you there. That's fine, because it's not affecting your end users yet, just some beta or alpha users. So that's how I would go with it: don't let it get to the point where the client is screaming at you. I love it.
So: is it up to the programmer to handle potential CRUD anomalies, like deleting all copies of a customer in the denormalized, document version of the data? In a document database, you typically don't have something like foreign key constraints, so you could delete a document and still have a reference to it elsewhere. I would say it could be up to the programmer, depending on which programmer you mean; it could also be up to the DBA. There are some options in Couchbase, for instance, to write what's called a Couchbase Eventing function, which responds to documents being updated or deleted. At that point you can say: okay, this document's been deleted, so go ahead and delete these other related documents as well. That said, a lot of the applications I've seen don't really do a hard delete of data anymore; they do a soft delete and flag the data as deleted, something like deleted equals true. So you may not have to worry about doing an actual delete of all the data; you can soft delete it and expire it after a certain period of time, something like that. Matthew, are joins and ACID support a viable reason to replace transactional databases? As an employee of a company that makes a NoSQL, non-relational database, I'd say yes, of course. But seriously: if you're running into problems with scale and performance, and it comes down to "it's hard to scale out a traditional relational database" or "these joins are slowing down our queries too much," maybe you can try another approach with non-relational. Yes, I think that would be a good reason to look into it. Would I say relational databases are going away, that this is the death of relational databases? No, I don't think so at all, if that's what you're asking.
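A sketch of the soft-delete pattern described above (the bucket, scope, collection, and field names are all hypothetical; actual expiry would typically be set through an SDK or a document TTL):

```sql
-- Flag the customer as deleted instead of removing the document.
UPDATE myapp.sales.customer AS c USE KEYS "customer::42"
SET c.deleted = true,
    c.deletedAt = NOW_STR();

-- Queries then simply exclude soft-deleted documents.
SELECT c.*
FROM myapp.sales.customer AS c
WHERE c.deleted IS MISSING OR c.deleted = false;
```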
What is Couchbase's document memory size limit; for example, how big a document can be maintained in memory? With Couchbase, the memory limit is basically a per-bucket, per-node quota. If I give one gig of RAM to a bucket and I have five nodes, that's five gigs of RAM across the whole cluster, and you can give as much RAM as you want per node. Now, the document size in Couchbase is limited to, I think, 20 megabytes, which is a reasonable limit for replication reasons: if documents get much larger than that, you could end up with very, very heavy traffic between nodes for replication purposes, so that's the limit Couchbase has set. And: can we create actual physical tables using N1QL? There will be a lot of business users used to accessing tables with BI tools; how can they access Couchbase? It depends on what we're talking about. If your business users are still writing SQL, using for instance Power BI or a tool like that, there are JDBC and ODBC drivers available for Couchbase, so you could use those. And as I mentioned at the beginning, Couchbase has its own analytics service built in, which also works with SQL. So again, it's not going to be tables, but it's still going to be SQL they can use to write those queries. Those are both good options for reporting and BI analysis; ODBC, JDBC, and Couchbase Analytics is the short version of my answer. Does Couchbase support any sort of join between two different JSON document collections? Yes, any sort of join. Well, do you have a particular join in mind? Certainly I've shown an inner join already, and you could do a left join, and you can do joins on multiple fields.
If there's a specific type of join in mind, the short answer is still yes: you can join between collections, between buckets, between scopes, anything like that, assuming your user has permission to do so. Sure. And you mentioned AWS as a cloud environment; what operating system is this database optimized to run on? I don't want to get into any operating system wars here, but Couchbase supports Windows and many Linux distributions, and for developers it also supports Mac and Windows desktop; not as a production environment, certainly, but as a development environment. I think most of our customers are probably on Linux, but many are using Windows. I'll also mention that Kubernetes and Docker containers are available for Couchbase as well. There have been a lot of questions along the lines of comparing Couchbase to other NoSQL databases, but we are a vendor-neutral spot, so just to let everybody know, I'm not going to be asking those questions. You're certainly welcome to contact me afterwards if you want to discuss that; happy to discuss it. So, what is Entity Framework? The Entity Framework I mentioned during the presentation is specifically a Microsoft .NET technology; it's an ORM, similar to Hibernate if you're familiar with Java, or the ORM built into Laravel, for instance. That's what I was referring to. Does Couchbase support traditional foreign key constraints for referential integrity between data sets? There are no foreign key constraints; this is what I was referring to when I said we're not just trying to rebuild a relational database.
Foreign key constraints make scaling more challenging, so there is nothing like that in Couchbase right now; it does support joins between fields, but not foreign key constraints. In fact, I can't think of any document database that does support them. Any possibility in the future to add Couchbase, as NoSQL, as a flexible platform layered on top of RDBMS-based applications? Can you say that one more time? Yes: any possibility in the future to add Couchbase and NoSQL as a flexible platform layered on top of RDBMS-based applications? Hmm. I'm not sure what that would look like: NoSQL as a flexible platform on top of relational-based applications. I don't know how that would work. I will say, not to get too inside-baseball here, but Couchbase Lite, which is the mobile offering for Couchbase that I mentioned briefly, does, I believe, for Android and iOS anyway, use the databases provided by those devices behind the scenes. But that's not Couchbase Server; with Couchbase Server, I don't think you would add another layer on top of a relational database. Now, if the question is more along the lines of "can I split out part of my application to use Couchbase as a back end and keep the rest on relational?" then absolutely, especially in a microservice architecture. That makes a lot of sense, and it's what a lot of our customers are doing: one part of their system is really struggling under the load, under the scale, like a user profile store or a catalog, for instance, and it makes sense to leave the parts that aren't struggling on relational and move just that part to Couchbase. That makes total sense to me.
But in terms of a technical layer on top of relational, I don't think that tracks. All right, lots of great questions coming in here. Is there something like row-level security? The lowest level of security that Couchbase 7 has right now, I think, is collection-level security. Now, I will say there are some options at the SDK level: if you're using Java or .NET or whatever, there are some encryption options available at what you're calling the "row" level, which would be the document level in Couchbase, and there are ways to encrypt specific fields. But there's nothing right now where you can say a given user only has access to a subset of rows, for instance; the best way to do that would be to create a separate collection, I think, and assign permissions to that collection. And I think we have time for a couple more questions. Any other integration best practices to adopt for ServiceNow or Broadcom monitoring tools? I'm not familiar with ServiceNow or the Broadcom monitoring tools. The last thing I saw with monitoring and Couchbase had to do with Prometheus; there's a Prometheus exporter for Couchbase out there, and I think there are integrations for some of the more popular monitoring tools. I don't know if I'm stepping into vendor territory, but there are some monitoring tools out there; I don't know about those tools specifically, so I'll have to make a note to look them up. And can you speak to any known healthcare implementations with healthcare interoperability resources? You're talking about FHIR, the Fast Healthcare Interoperability Resources standard. Yes: there's actually a blog post on Couchbase, and I think we have at least one customer who has implemented FHIR.
And you can go check that out. I don't know if it mentions a customer specifically, but it certainly shows an implementation of FHIR, and Couchbase is ideal for that: FHIR is a JSON-based healthcare standard, I don't know if I'd describe it as a schema, but it's a way to represent data in JSON for, like you said, interoperability between healthcare providers. So yes, Couchbase is very involved in that. I don't have the URL in front of me right now, but if you Google "Couchbase FHIR," that'll be the first result. All right, and I think we can squeeze this in pretty quickly: does Couchbase Lite use SQLite? So I stopped short of saying that earlier, but I think it depends on the operating system. I think it does use it behind the scenes, at least on Android; I'm not sure about other operating systems. But it's not something where you would want to go into the SQLite database directly; that would be a potential for causing corruption, things like that. I'm going to keep squeezing in some questions here; we've got two minutes left. Is there any data masking ability? Right, so if by masking you mean encryption of individual pieces of data, then yes: I mentioned there are some client-side options, at least in .NET and Java, and there may be a few other SDKs that support that. In terms of something larger, like encrypting data at rest: data in motion is certainly encrypted with Couchbase, so if you're querying, it's all over TLS, things like that. For data at rest, I believe we have several partners, and certainly the cloud options as well will do some of that for you, if you're interested in encrypting the entire data set at rest. So, Matthew, thank you so much for this great Q&A and for the presentation. And thanks to Couchbase for sponsoring; we always love it when you guys join us.
And thanks to all of our attendees for being so engaged in everything we do. That is all the time we have for today's presentation. Just a reminder: I will send a follow-up email to everybody by end of day Thursday with links to the slides, the recording of this presentation, and how to get more information from Couchbase. Thank you so much. Thanks, everybody; hope you all have a great day and stay safe out there. Thank you, Shannon. Thanks, everybody, for coming.