Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Officer of DataVersity. We'd like to thank you for joining the latest installment of the monthly DataVersity webinar series, Advanced Analytics with William McKnight, sponsored today by Couchbase. Today, William will be discussing assessing new database capabilities: multi-model.

Just a couple of points to get us started. Due to the large number of people attending these sessions, you will be muted during the webinar. We'll be collecting questions via the Q&A panel, or if you'd like to tweet, we encourage you to share highlights or questions on Twitter using the hashtag #ADVAnalytics. And if you'd like to chat with us or with each other, we certainly encourage you to do so; you'll find the icons for the Q&A and chat panels in the bottom middle of your screen. Just know that the chat defaults to sending to just the panelists, but you may absolutely change that to network with everyone. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and any additional information requested throughout the webinar. Now let me turn it over to Rick from Couchbase for a brief foreword from our sponsor. Rick, hello and welcome.

Sorry about that, Shannon, I was muted. No worries. We're not seeing your screen right now; you stopped sharing. All right, let me get that back on. All right, good to go? Looks good.

All right, so thanks a lot for that introduction, Shannon. I'm going to take this opportunity to quickly tell everyone a little bit about Couchbase, which is a multi-model NoSQL document database. I'll also be talking a little bit about the Couchbase Analytics service, which helps our clients perform analytics on real-time data, and then about some use cases and customer successes.

So why Couchbase? Well, Couchbase is a little bit different. We're a single platform that includes capabilities and services supporting different functional requirements in a single multi-model database, which obviously helps to reduce vendor sprawl. Couchbase is fast: the memory-first architecture, elastic scaling, and automatic replication and synchronization provide fast performance from the edge all the way to the cloud, even as you scale. We're also flexible: data is stored as JSON documents, which gives developers the flexibility to update the data structure dynamically as application needs change. So as the application grows and the needs change, you can apply the schema as the data is read from the database rather than when it's written, which is much faster and a lot more flexible. Couchbase is also familiar: it's designed to make the transition from the relational to the NoSQL world easier. Our customers can move from their traditional relational databases to a NoSQL database while still leveraging skills like SQL, and they can duplicate their relational database models within Couchbase; for example, schemas and tables in the relational world are available in Couchbase as scopes and collections. And last but not least, Couchbase is also affordable. Capabilities like Multi-Dimensional Scaling allow users to configure clusters based on the workload, so you can configure your cluster for the type of workload you're running, which helps to minimize waste. So you're not wasting resources, you're utilizing all your resources, which in turn allows you to be more effective and more cost-effective.
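To make the flexibility and scopes-and-collections points concrete, here is a minimal schema-on-read sketch using the Couchbase Python SDK (3.x-style API); the cluster address, credentials, and the bucket, scope, collection, and document names are illustrative assumptions, not details from the webinar.

```python
# A minimal schema-on-read sketch, assuming the Couchbase Python SDK 3.x.
# "ecommerce", "inventory", and "products" are hypothetical names.
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions

cluster = Cluster("couchbase://localhost",
                  ClusterOptions(PasswordAuthenticator("user", "password")))
bucket = cluster.bucket("ecommerce")                            # roughly a database
collection = bucket.scope("inventory").collection("products")   # roughly schema/table

# No schema migration step: newer documents can simply carry extra fields.
collection.upsert("product::1", {"name": "kettle", "price": 24.99})
collection.upsert("product::2", {"name": "toaster", "price": 39.99,
                                 "tags": ["kitchen", "sale"]})  # new field

# Structure is interpreted on read, not enforced on write.
doc = collection.get("product::2").content_as[dict]
print(doc.get("tags", []))
```

The second document simply carries an extra field; nothing had to be migrated, and the reading application decides how to handle fields that may or may not be present.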
All right, so we've got three different options for Couchbase. The first option I'm going to talk about is Couchbase Server, which is a self-managed offering available on all the cloud vendors. This gives our customers a lot of control, so they can maintain control over their data, and it also gives them the flexibility to manage their workloads as they wish. They can leverage their existing DevOps teams and DBAs, because those skills transition over to Couchbase, and they can implement their own management strategies and tools. Generally, our customers deploy Couchbase Server using Kubernetes. Then we have Couchbase Mobile, which is an offline-first database that allows peer-to-peer syncing. So IoT devices can sync peer-to-peer with each other and then sync back up to Couchbase Server when they have access to the server. We also offer automatic data conflict resolution, which is very important, because as you gather data from these edge devices it can sometimes be difficult to determine which data is the most current. And then we have Couchbase Capella, which requires no administration; it's a DBaaS offering, and it's very easy to start building both mobile applications and conventional database applications. The setup is fast and easy, maintenance and upgrades are handled automatically, and that allows for a faster time to market and better TCO. Basically, you can get the same performance from fewer nodes with Couchbase, which helps to drive down your environment costs.

So let's talk a little bit about the Couchbase Analytics service specifically. This service allows isolation of transactional data from analytical data, which helps to speed up the processing on both sides. You have separate nodes for each group of data, so each service is able to perform faster. The separation means that transactional jobs and analytical jobs run in separate environments, so you can do things like complex analytics, joins, aggregations, and so on against a shadow copy of the data in your operational environment. Both of these services support SQL querying, which helps to shorten the learning curve, because you don't have to learn a proprietary query language like you do with some other platforms. The data in Couchbase is saved as JSON documents, which means developers can assign structure when the data is returned from the database. So there's no ETL required: you don't have to transform the data as it's being ingested into Couchbase; you can assign a schema as the data is read from the backend database. Also, a lot of our processing is done in memory, which means Couchbase is a lot faster. And all our nodes are equal; we don't have any management nodes, so all nodes are available for work. No node waste, which obviously improves efficiency and ultimately lowers TCO.
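As a rough illustration of querying the analytics service with ordinary SQL skills, here is a hedged sketch using the Couchbase Python SDK's analytics query call; the dataset name `orders` and its fields are hypothetical, not from the webinar.

```python
# A hedged sketch of an analytics-service query, assuming the Couchbase
# Python SDK 3.x; the "orders" dataset and its fields are hypothetical.
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions

cluster = Cluster("couchbase://localhost",
                  ClusterOptions(PasswordAuthenticator("user", "password")))

# Runs on the analytics nodes against a shadow copy of operational data,
# so the transactional workload on the data service is not slowed down.
result = cluster.analytics_query("""
    SELECT o.region, COUNT(*) AS orders, SUM(o.total) AS revenue
    FROM orders o
    WHERE o.status = "complete"
    GROUP BY o.region
""")
for row in result:
    print(row)
```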
So what are some of the requirements for an analytics platform? Well, it needs to be timely. Most modern organizations are data-driven: they need to be able to analyze data and extract information very quickly, and they need to be able to make decisions on the data they're generating and then analyzing. Couchbase helps to solve that problem by making your operational data immediately available, which improves your flexibility. I talked a little earlier about the ability to assign a schema when the data is read, as opposed to when it's ingested; again, this helps to improve speed. And Couchbase is also scalable, so both your analytics service and your data service can scale up as needed.

This diagram does a pretty good job of explaining the Couchbase architecture. As I mentioned earlier, the analytics data is housed on separate nodes from the data service data, the actual data from the operations of the company. So we've got separate data nodes for each set of services. As you can see here in the middle box, you run your operational apps on the data service and your analytical apps on the analytics service. Users can also join data from disparate data sources. For example, here on the lower right, you can link data in Azure or in S3 to your Couchbase Analytics service and then query that data from the analytics service, which is a very good perk to have, right? You can have data sitting in S3, even CSV files, link that data to your analytical data in Couchbase, and then query both data sets from within the analytics service. And we also have connectors for BI tools to visualize your data; there's a native Couchbase Tableau connector, for example, that you can use to visualize your data in Couchbase.

All right, so let's talk about some customer stories. I'm going to focus on this e-commerce example. Basically, what we have here is a need for real-time marketing. Lots of organizations want to be able to market to their customers in a timely fashion, so they need to analyze dynamic data in as close to real time as possible, with the goal of creating personalized marketing campaigns and reducing the time it takes to execute their ML models on their analytical data. In practical terms, Couchbase is able to help our customers reduce the time it takes to create these targeted marketing offers, and we can enhance their ability to gather insight from their customers and determine trends. The whole point is to make it faster for these companies to react to their customers' needs.

All right, I'm going to run through these examples quickly because we're running a little bit low on time. One customer that uses the Couchbase Analytics service is Domino's, and again, they use it for real-time marketing. We were able to help them reduce the time it took to run their ML marketing models. They have models that they need to run on their operational data in order to determine how to market to their customers. With that data separated into Couchbase Analytics, they can run their models on Couchbase Analytics using user-defined functions and get virtually instant information on their customers and their customer preferences. Another example is the Cincinnati Reds. Again, I'll just quickly summarize this: they needed real-time analytics too, and we were able to provide that for them and help them increase their customer retention and also the reporting capabilities within the organization.

All right. And here we have a few customers who utilize Couchbase. As you can see, we kind of run the gamut; we've got customers across various sectors. We help them to improve their user experience,
reduce cost, as I mentioned before, and speed up their time to market for their analytical applications. All right, that's about it. Thanks a lot, Shannon. I'm going to turn it back over to you.

Thank you so much for kicking us off, and thanks to Couchbase for sponsoring and helping to make these webinars happen. If you have questions for Rick about Couchbase, you may submit them in the Q&A portion of your screen, and he'll be joining us in the Q&A at the end of the webinar. Now let me introduce our speaker for the series, William McKnight. He has advised many of the world's best-known organizations, and his strategies form the information management plans for leading companies in numerous industries. He is a prolific author and a popular keynote speaker and trainer, and he has performed dozens of benchmarks on leading database, data lake, streaming, and data integration products. And with that, I'll give the floor to William to get his presentation started.

Hello, Shannon, and welcome, everybody. Thank you so much, Rick, for kicking us off here. It's a pleasure to have a strong database in the area I'll be talking about today as the sponsor, that being Couchbase. Today we're going to talk about assessing new database capabilities; I guess they're still kind of new. And that is multi-model, or multimodal. You'll probably hear me say both, and you'll hear both said out in the industry; essentially it's the same thing, so it's just more terminology to get used to.

So when did this all start, this idea of multi-model? Well, in 2012 the term multi-model database was coined by Luca Garulli, and I'm probably saying that wrong, during his keynote presentation at the NoSQL Matters conference in Cologne, Germany. And the beginnings were traditional SQL database solutions starting to add support for the JSON data type. I remember this, as NoSQL became more prominent and widely adopted in the early to mid 2010s. I think it's a pretty important topic. Why? Because are we destined for our architectures to forever get more and more complicated, with more and more vendors and more and more products added to them? It sure seems that way. So whenever I see a little pocket where maybe we can do some consolidation, I really look hard at that, because our architectures are getting pretty complicated out there. I've spent some time in past sessions talking about that and all the layers of the architecture, so much more than it used to be. So where we see an opportunity for a database to actually handle multiple data types, I think it behooves us to check that out.

By the way, you've got to beware here as a buyer, because who's not multi-model out there? I went to four conferences last month, mostly put on by database vendors, and either way there were a lot of database vendors at them; I always seek them out. And of course I have my usual briefings from database vendors leading up to this presentation, and I'm asking them about their multi-model capabilities, and oh yeah, they all have all these multi-model capabilities, right? Now, for some of the models that are part of multi-model, you have to have a certain level of capability for it really to be enterprise-grade, and for it to make sense within your enterprise for you to even consider it for a true enterprise.
There are some capabilities that you need within all the aspects of the models that you are going to use in these databases, so buyer beware, be careful. Everybody's multi-model to some degree, but we're going to talk today about some criteria you want to apply to the models to make sure each one is good enough for you.

That's a little bit more about me; I've been introduced. Keep in mind that a lot of what we'll be talking about today has to do with coming from the NoSQL marketplace. Now, not all of the vendors formerly known as NoSQL are still embracing the term; some are, but I like the term, so I'm going to use it. It has to do with those databases that are not straight-up SQL against relational databases, all right?

A little bit about McKnight Consulting Group: we offer strategy, training, and implementation for making data into the asset it needs to be for your organization, and we've done this for several Global 2000 organizations as well as the mid-market, to high degrees of success. So if you have any questions about your architecture, technology, roadmap, or organization, let me know; maybe we can help. And these are some of the clients we've done this for: a lot of finance, a lot of health care, but really we cover all verticals.

Now, you're out there making a lot of decisions today, I know you are, and I know it's more complicated than ever to be in this business, because there are so many failure points. If you get one layer of the architecture wrong out of ten, you're still bringing the whole thing down, or at least limiting the capabilities a bit. Like I said, it's more complicated than it used to be. There's an unprecedented variety of data store choices to meet the needs of varied workloads; there's no one-size-fits-all, that ship has sailed, though maybe it'll come back. Now, I see glimmers of hope for more consolidation beyond what we're talking about here today. That would be nice, but today is not that day. Enterprises have many needs for databases, including cache, operational, data warehouse, master data, ERP, and so on, and there's plenty in the operational area, by the way, that we could itemize out here. But we all know databases; without them, I don't know where enterprises would be, still working with flat files, I guess. Vendor offerings have exploded in recent years, and in due time frameworks will integrate components into what amounts to a single offering; yeah, that's what I was talking about, a single offering. So price-performant offerings for adjacent workloads in an enterprise have materialized. I'm here to say buyer beware, but I believe that some of them have, and let's take a look as we go ahead.

No one size fits all, once again. Just to pick on one of the data stores we're talking about here: cache. A cache database, which we don't always think about, is not a straight-up relational database, right? It's all about the trade-offs between expensive, fast in-memory storage and slower storage. As an internet example might indicate, its selection is all about access speed: it must have sub-millisecond response, and it must hold its speed as a high number of users pound the database. Now, I'm talking about things as if you must do this, you must do that; always put the lens of your enterprise on it.
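To make the cache trade-off concrete, here is a minimal cache-aside sketch using Redis via redis-py; the key scheme and the `load_user_from_database` stand-in are illustrative assumptions.

```python
# A minimal cache-aside sketch with Redis (redis-py). The key naming and
# load_user_from_database() are hypothetical stand-ins, not from the talk.
import json
import redis

r = redis.Redis(host="localhost", port=6379)

def load_user_from_database(user_id: str) -> dict:
    # Placeholder for the slower system of record.
    return {"id": user_id, "name": "example"}

def get_user(user_id: str) -> dict:
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:                    # fast path: in-memory hit
        return json.loads(cached)
    user = load_user_from_database(user_id)   # slow path: system of record
    r.setex(key, 300, json.dumps(user))       # expire after 5 minutes
    return user
```

The expiry keeps the expensive in-memory tier holding only recently used data, which is exactly the fast-versus-cheap storage trade-off described above.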
But I do believe that once you get into one of these areas, you're going to go where a lot of my clients have gone and where a lot of the industry goes in terms of what you need. So, multi-model. It's a pretty loose term out there, so let's try to put some flesh on it. There are many data types, in no particular order here: web crawl data, open linked data, JSON. Yeah, JSON is pretty predominant; like I said, a lot of this multi-model business came about because there were JSON databases emerging, and relational databases said, hey, we can take that on as a data type and do some things with it. XML, kind of the same thing there; XML is not as much in favor today as JSON, but still there. Of course documents; we'll talk more about that when we talk about document stores. Binary files, being your video, audio, that sort of thing. Graph; we'll talk about that. Log files: log files are huge. Log files list actions that have occurred, and web servers maintain these log files, listing every request made to the server. With log file analysis tools, it's possible to get a good idea of where visitors are coming from, how often they return, and how they navigate through a site. And hopefully the rest of these are pretty straightforward. Let me mention web crawlers, since that might be new to some people: a web crawler is an internet bot which systematically browses the web, typically for the purpose of indexing it; another term for it is spider. So anyway, these are some data types that we're trying to get a handle on within the organization. Can we just put them in a good old relational database, good old legacy rows and columns?

So why do we need something else? Why did the NoSQL phenomenon emerge? And I'm guessing in putting a year on it right here, maybe a dozen years ago or so, when this really emerged and gave us some real options. So why NoSQL, why not just straight-up relational databases? Well, you get more data model flexibility; you get more choices of data model. There's no schema-first requirement: you load data first, load what you've got, and hopefully you don't make a mess of it by loading too many incongruent data types into one store. Of course, there's still some design work left in NoSQL databases; people forget this. You get faster time from data acquisition. Relaxed ACID: that's atomicity, consistency, isolation, durability, all those things that you need for good transaction processing. Incidentally, some multi-model databases ensure ACID guarantees across all data stores, which is much harder to guarantee than in individual databases. So even though I say relaxed ACID is a good thing, some vendors do ACID and do it well, and it doesn't really impact performance anymore; it used to be that none of these had ACID. Low upfront software and development costs; that's true for a lot of things beyond NoSQL when it comes to the cloud. Programmer freedoms, fault-tolerant redundancy, and linear scaling.

So there are many ways that NoSQL designs solve problems more efficiently than traditional SQL and relational systems. In many cases, those systems require complex processing that can be avoided by NoSQL systems. The use of document orientation, for example, avoids one of the most complex pitfalls of application development, the mapping between objects and relational systems. This is known as the object-relational mapping problem.
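A small illustration of that object-relational mapping point: the nested order below would be shredded across several relational tables (orders, line items, addresses), while a document store can persist the application object as a single JSON document. The field names are purely illustrative.

```python
# Sketch: one application object persisted as one document, rather than
# mapped across orders / line_items / addresses tables with joins.
import json

order = {
    "order_id": "o-1001",
    "customer": {"name": "Ann", "email": "ann@example.com"},
    "ship_to": {"street": "1 Main St", "city": "Cologne"},
    "line_items": [                      # one-to-many, no join table needed
        {"sku": "A-1", "qty": 2, "price": 9.99},
        {"sku": "B-7", "qty": 1, "price": 24.50},
    ],
}

doc = json.dumps(order)   # this is, more or less, what gets stored
print(json.loads(doc)["line_items"][0]["sku"])
```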
Okay, why NoSQL? Let's keep going. These are some of the things that were fundamental to NoSQL, and that's distributed file systems. Of course, this is true for Hadoop, and this is true for cloud storage as well. We have multiple nodes, and blocks are placed across those nodes strategically; we won't get into the algorithm too much here today, but the policy is based on one copy written to the node creating the file, a copy written to a data node within the same rack, and a third copy written to a data node in a different rack. Usually three copies, and we can change that, of course, but I usually want to leave it at three. The idea being that you are now down to a 0.00-something percent chance, I don't know how far out I want to go here, that you'll lose any data, and that's obviously a very important underpinning of the entire DFS ecosystem. Your design goals: you want it scalable to thousands of nodes; assume failures are common, because they're going to be more common with this type of commodity hardware; target this toward small numbers of very large files; and write once, read multiple times. These are highly scalable, to thousands of nodes and massive files; we're talking hundreds of terabytes to petabytes that can now be stored in these NoSQL databases. They don't use mirroring or RAID; they have other mechanisms, namely the one I just showed you, the duplicated blocks, to deal with a wide variety of failure types, and the secret sauce of the whole thing is that they will quickly restore their fault tolerance if something goes wrong. So load balancing, fast access, and fault tolerance are underpinnings of the entire NoSQL movement.
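The durability claim is just probability arithmetic. Here is a back-of-the-envelope sketch, assuming, somewhat unrealistically, independent node failures at an illustrative 1% rate:

```python
# Back-of-envelope durability math: with three replicas, all three copies
# must be lost before any data is lost. The 1% per-node failure probability
# and the independence assumption are illustrative, not from the talk.
p_node_failure = 0.01
replicas = 3

p_data_loss = p_node_failure ** replicas
print(f"{p_data_loss:.6%}")  # 0.000100% under these assumptions
```

Real systems do better still, because they re-replicate quickly after a failure, shrinking the window in which a second and third loss could line up.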
Now let's get into some of the NoSQL databases and capabilities, and I hope that you are thinking today about these capabilities in the context of the database that you're working with. Is it truly multi-model? Does it have these types of capabilities? Do they even claim graph? If they don't claim graph, and you need graph, and you probably do, that would mean you probably need a separate graph database, okay; so there's one less opportunity for great consolidation there. Now, for the property graph, there are really only a couple of products out there; Neo4j is kind of dominant in the space, but it attacks some of the same problems we're going to see on the next slide when I talk about the semantic graph. This is a bit of a property graph model. I've given a whole Advanced Analytics presentation on graph databases, so, you know, I have a high affinity for them; there's a lot of use for them that I can find in an organization, and you can too, wherever relationships exist and are important, because you can relate nodes by type and direction, and you can have name-value properties. So this is a property graph here, and you can see why it's a property graph: there are properties on the relationships. The person named Ann in the upper right apparently owns the Volvo; she owns it, that's the relationship. The person in the upper left, Dan, drives the Volvo, so drives is his relationship, and then other attributes can be assigned, like he's done that since January 10, 2011, et cetera; those are the properties. That's a property graph.

Then there's the semantic graph. They are different in how they attack largely the same problem, although semantic graphs tend to be a little bit more scalable, and obviously the storage is going to be different. There are more limitations, I would say, on a semantic graph, and there are plenty more vendors of semantic graphs; this has really taken off in the past few years. So look for your need for this in your multi-model database; look for your need for this in the enterprise. The data is stored as what's called an RDF triple, three elements (a subject, a predicate, and an object) working together in a triple store. Semantic databases only work with RDF; users of third-party data in RDF were the initial target market, and now it really works across all data sets. SPARQL has the equivalent functionality to the language we might use on a property graph, like Cypher, and examples of semantic graph databases are AllegroGraph and Eureka, I think. Linked open data is linked data that is open content; we don't need to be concerned too much about that. But if we have graph problems in our organization, we need a graph solution; don't try to force-fit this into your relational database, and I think that's part of my message for today. Don't try to force everything into the relational database unless it's a multi-model database and it can do all these things. Yeah, I'm leaving a lot of judgment on the table for you because, well, I can't help it; that's the way it is. This is a complex job, and frankly, I think it's only going to get more complex for quite a while, the way things are going, with the mass entry of vendors into the space that we're in and so on. Exciting times.
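As a hedged sketch of the slide's property graph, here is roughly what Ann, Dan, and the Volvo look like in Cypher via the Neo4j Python driver; the connection details are placeholders.

```python
# A hedged property-graph sketch using the Neo4j Python driver; the URI
# and credentials are placeholders, not details from the presentation.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))

with driver.session() as session:
    # Relationships have a type and a direction, plus name-value properties.
    session.run("""
        MERGE (ann:Person {name: 'Ann'})
        MERGE (dan:Person {name: 'Dan'})
        MERGE (car:Car {make: 'Volvo'})
        MERGE (ann)-[:OWNS]->(car)
        MERGE (dan)-[r:DRIVES]->(car)
        SET r.since = date('2011-01-10')
    """)
    result = session.run(
        "MATCH (p:Person)-[r:DRIVES]->(c:Car) RETURN p.name, r.since")
    for record in result:
        print(record["p.name"], record["r.since"])

driver.close()
```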
So, databases are multi-model when they can be, for example, either a key value store or a document store. That's probably the low-hanging fruit when it comes to multi-model: key value plus document, or document plus key value. I'll get into document and key value and explain what they are. A key value example: there's a key, and then there's everything else, and you have to access everything by the key; the timestamp, for example, is the key in this example. Couchbase is a multi-model database, for example, because it supports multiple data models, including key value, document, graph, search, and others. A lot of what I find, in terms of what drives which platform and which data model to use, is the data type. And yes, of course, it also depends on the volume of data that you have for that data type: more volume might drive more robust solutions, and you can get away with less robust solutions at lower volumes for the data types that aren't your core data types. But generally speaking: if you have CSVs or web logs, that's going to be either a column store or a document store, and I'll explain all of these as we go along; documents go in a document store; JSON, a document store; a metadata catalog, column or document; keyed images and documents, key value; and RDF and linked data, as discussed, a graph store. So your data type will drive a lot of the model that you need to use. Again, try not to force-fit things; you don't have to force it anymore.

Let's continue into key value stores; I almost said start with, because this is sort of where a lot of the NoSQL marketplace began. This is NoSQL's OLTP equivalent. As a matter of fact, as a side note, I will say that I believe a lot of these NoSQL databases are taking the reins from relational databases when it comes to operational databases. Operational databases are still relational for the most part, but I do see quite a bit of modern applications using NoSQL databases in their place, and maybe that'll become clear as I talk about this a little more. OLTP is pretty simple: there's not a ton of functionality, and it's fast. It's an associative array data model: there's a key and a blob pair. I say blob even though it can be columns, so to speak, but the columns are really not accessed individually, and for many that's a knockout factor, that's a problem. But if you need high levels of speed, key value is for you. You retrieve a value given a key; all access is by the key. The difficulty is that your application cannot run arbitrary selection queries, like select star from the table, and so it needs to know where to look for objects in advance. That can be pretty limiting, but if you're trying to deal with high volumes of data and access it fast, maybe you can deal with that. These are databases like Riak, Redis, Memcached, Berkeley DB, and the Dynamo-inspired stores, open source ones and Amazon DynamoDB; all the major hyperscalers have products in this space.

All right, key value stores, continuing on here. They are horizontally scalable, fast, very fast, resilient to cluster failures, and simple, and all nodes are equal. There can be thousands of TPS per CPU core, and you can use indexes to look up keys; Riak, for example, uses Solr. All the data stores do a hash of the key to determine the location in the cluster, which obviously is part of the secret sauce of being able to find the rest of the record based on the key. It's good for any single-object access, unstructured data blobs, and speed wherever speed is important, like in multiplayer online games; shopping carts, where it can handle part of that processing pretty well; geo-localized processing, where you're trying to take advantage of exactly where a person or thing is; transitory data; and situations when you can't be down, since these are very resilient databases, which is great. Key value is often a good choice for serving advertising content to many different web and mobile users simultaneously with low latency; that's probably the main use. Content of this sort, e.g. images or text, can be stored in key value using unique keys generated either by the application or by the key value store itself, and these can get up to 100,000 writes per second. Wow, pretty fast. Now, a lot of key value stores, or at least the ones that started out as key value stores, have moved into multi-model, and I'll have something to say about that as we go along here.
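Here is a toy sketch of that "hash of the key determines the location" idea. Real stores use consistent hashing or bucket maps rather than a bare modulus, but the principle, that the client computes the owner node directly from the key, is the same.

```python
# Toy key-to-node routing sketch; node names are illustrative. Production
# systems use consistent hashing or vBucket maps, but the idea is the same.
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]

def owner(key: str) -> str:
    digest = hashlib.md5(key.encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]   # no lookup service needed

for k in ("cart::alice", "cart::bob", "session::9f2"):
    print(k, "->", owner(k))
```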
Now it's time to define the multi-model database. A multi-model database is a single, integrated database that can store, manage, and query data in multiple models, such as relational, document, graph, key value, column store, and cache. It's the opposite approach to polyglot persistence; anybody hear that term anymore? Polyglot persistence was the idea that we should use multiple databases in a workload, each for its own unique value proposition, and stitch it all together. We still do that, for sure, but we do want to limit it. The multi-model idea is not new; we've had databases used to support multiple workload categories, albeit on a limited basis. What is new, and why I put it in the title, is that they're really capable today. Make sure that your selection, and your expectation of how a solution will work with one of the models, is in line with what the capabilities really are, and that they're not just putting band-aids on under the hood. For example, and I know we passed it already, when it comes to graph, it's one thing to be able to access data via graph and use graph algorithms on data. But if under the covers it's not stored as a triple store, that's going to perform really slowly, and at some level of enterprise workload that's not going to work.

Document-oriented databases: Couchbase is a document-oriented database. These are key value stores plus, in other words: they have all the value propositions of key value stores, but they have added capabilities as well, like the ability to nest sub-documents. The data here is stored as JSON or XML, largely JSON, with a tree-like structure, and you get the ability to group data together more naturally and logically. So, key value stores plus things like queryable values; remember, I harped on that as a challenge with key value stores, and here the values are queryable. Materialized views, yes; indexes, yes; documents are addressed by URIs; and these support REST interfaces. Good for things like event logging, content management, real-time web serving, and e-commerce. It replaces SQL abstract programming, where frequently you don't have values for things but you still have to store a placeholder in the relational model, often a null or an empty column, sometimes a zero; we don't have to do that with document-oriented databases. The idea is to store all data together. Documents are self-describing, hierarchical tree structures, and unlike key value stores, the value part of the field can be queried, and that's a big plus. They're good for the things I mentioned, and more; they're good for a lot of things, the general multi-purpose model. I would say this is the dominant model for NoSQL and a great place for a vendor to launch off into the other models. Let me say this right: the strongest multi-model databases come from a document-oriented background.

And in case it's not all clear, this is a document example, a baking recipe of type mama's cornbread. You see the ingredients, you see it's nested, and so on. We won't be making that cornbread today, but if you do make it, let me know how it goes. Couchbase has elastic scalability, always-on availability, and global deployment, some of the things that Rick mentioned before. It's a key value store where the value is a JSON document, which makes it a document store; if it's not JSON, it's stored as a base64 string. The JSON is eligible for indexing, and there's no extra step of taking the database down to change the schema or anything like that. I could go into this in detail, and I'm going to try to hold back here, but I'm excited about document databases. There's a hash mechanism that gives an even distribution of the data: the data is distributed based on a hash of the key across data buckets, and there's an algorithm where there are so many buckets and each key is mapped into a certain one, and so on; it's really a beautiful thing how the data gets distributed.
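Here is a hedged sketch of the "values are queryable" point using the Couchbase Python SDK, with a document in the spirit of the cornbread slide; the bucket name and the assumption that a suitable index exists are mine, not the presenter's.

```python
# A hedged document-store sketch, assuming the Couchbase Python SDK 3.x;
# the "recipes" bucket and the presence of a primary index are assumptions.
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions

cluster = Cluster("couchbase://localhost",
                  ClusterOptions(PasswordAuthenticator("user", "password")))
collection = cluster.bucket("recipes").default_collection()

# A self-describing document with a nested sub-document array.
collection.upsert("recipe::cornbread", {
    "type": "recipe",
    "name": "mama's cornbread",
    "ingredients": [{"item": "cornmeal", "qty": "2 cups"},
                    {"item": "buttermilk", "qty": "1 cup"}],
})

# Unlike a plain key value store, fields inside the value can be queried
# (this assumes a primary or suitable secondary index exists on the bucket).
for row in cluster.query(
        "SELECT r.name FROM recipes r WHERE r.type = 'recipe'"):
    print(row)
```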
You can have multiple NoSQL solutions working together; this is that polyglot persistence approach. You would use a key value store for the shopping cart and session data, a document or column store for completed orders, a good old relational database for inventory and financials, and a graph store for customer relationships for marketing. Or you could use a good multi-model database, and that's part of the point here today. The mission-critical store in all of this frequently does come back to relational. Key value pairs are good for fast and simple; document-oriented for flexible modeling; and column stores for time series data. Let's get to that.

Column stores are a data model, also known as Bigtable-style, frequently with column families, where you can group different columns together and that data will be stored contiguously; it's like a columnar database in terms of how that part of it is handled. It uses a very light schema for querying and processing, but it's pretty close to a relational database as far as how you handle it. It's optimized for column-wide operations like counts, sums, and averages, so if those are a very important part of your processing, you might consider a column store. Cassandra is a big one there, with HBase and Hypertable among others. This is good for large amounts of data, data that needs compression, event logging, and content management systems. The data model supports semi-structured data and, like I mentioned before, time series data: data where you want to keep either the last X minutes or the last X generations and then you don't want any more. That's like weather data for operations, location data of things, and sensor data where you age out the readouts; it's not data that you necessarily want to keep around forever, although I do find that some organizations are flowing the data from column stores to a data lake for more history, I won't say all-time history, so that they can apply artificial intelligence to that data and hopefully improve their operations. Column stores utilize block compression, using different compression algorithms like gzip and LZO. This is a good model to launch into multi-model from; I would still say that document stores are as good and even better. But key value stores may not be the best place to launch off into multi-model, and there are a few reasons for this. Key value may not offer the best support for storing and querying complex data structures. Key value is designed for high performance and low latency, so it may not offer the best support for data consistency and transactions. And finally, key value stores are often in-memory data stores, so they may not offer the best support for data persistence. All of these are things you want in your multi-model database.

This is an example of a column store. You've got a row key, okay, that's kind of there, and then what we're showing you here are column families, with some definitions: we've got column keys and column values. It looks like we've got a profile and some orders going on for this particular instance, and you can get all related information using a single record ID. Usually they use a random mechanism for the row ID, which gives you faster writes and then slower reads, which is usually the trade-off you want for these workloads.
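For the time series case, here is a hedged column store sketch using the DataStax Cassandra driver; the keyspace, table, and TTL value are illustrative assumptions.

```python
# A hedged column-store sketch with the DataStax Cassandra driver; the
# "iot" keyspace, "readings" table, and 24-hour TTL are illustrative.
from datetime import datetime, timezone
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect()
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS iot
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}
""")
session.execute("""
    CREATE TABLE IF NOT EXISTS iot.readings (
        sensor_id text,
        ts        timestamp,
        value     double,
        PRIMARY KEY (sensor_id, ts)          -- rows clustered by time
    ) WITH default_time_to_live = 86400      -- age out after 24 hours
""")
session.execute(
    "INSERT INTO iot.readings (sensor_id, ts, value) VALUES (%s, %s, %s)",
    ("sensor-42", datetime.now(timezone.utc), 21.5))

# Column-wise aggregates are what this model is optimized for.
row = session.execute(
    "SELECT avg(value) FROM iot.readings WHERE sensor_id = %s",
    ("sensor-42",)).one()
print(row[0])
```

The table-level TTL implements exactly the "keep the last X minutes and then you don't want any more" pattern described above, with no delete jobs to run.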
Now, HBase does things a little bit differently; I believe the slide is a Cassandra example. HBase has different properties at the column family level, like the number of versions, the time to live, the compression technique, and whether it's in memory or not, and these are known and specified at definition time. I don't want to say you can't change this, but in Cassandra, for example, you do not specify the columns at definition; you do that when you are writing the records. We won't go into it a lot more; hopefully I've given you some idea about column stores.

Now, multi-model databases should support these models at least to the level that I just showed you, and I'm going to give you some things now to look for in multi-model databases; I've got a couple of slides for that. I want to see an excellent implementation of multiple models, not an excellent implementation of one model and a half implementation of the rest, because that can really get you into trouble. Counting on your multi-model database for, let's just say, key value support when it's really not there at an enterprise level, that can get you into trouble. You want a single copy of the data; you don't want the data to have to be replicated for each model. You want model change propagation: you change it in one place, it gets changed in all places. You want it to work in the microservices world, and you want sub-millisecond response time; it should not be slow. It's non-trivial to cover the features of an excellent implementation of any model, let alone multiple models; we're talking years, potentially, of development work to go from one to the other in a good way. While it's understandable if a platform has a first or anchor model, look carefully at the implementation of the second and subsequent models to ensure compliance with the best practices of those models. These include high compression, failover without service disruption, a cost-based optimizer, things like that. The ability to scale a multi-model solution from one model to multiple is crucial, since a multi-model database will typically be deployed with one model most prominent at a time, and then you will work your way through other models; at least that's how I've seen it done. Almost all of the models in a multi-model implementation, all the ones I'm talking about here, have been around a decade or more and consequently have a set of factors that make them work well; look at those for sure.

Here are some more things to look for in multi-model: globally distributed, multi-region deployments; a cross-model data processing language and optimizer, because you shouldn't have to change the language you're using to interface with the data just because you're using a different type of model; an edge-capable database, because today there's a lot of processing we can put at the edge; JSON flattening, yeah, good old JSON flattening without data explosion as a result; and universal indices. Wow, yeah, like I said, not easy. Globally distributed applications need a database that can distribute globally and transparently replicate the data to the center that is closest to its users. Beyond distributed multi-region deployment, a multi-model database attempts to embrace the challenge of the cross-model optimizer by developing a unified query language to accommodate all the supported data models. And I dare say that the optimizer is a big part of the work in moving to multi-model. A good multi-model optimizer will be more difficult to build than a single-model optimizer, but data virtualization vendors have overcome essentially the same problem, optimizing queries across data stores. So there is hope.
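To illustrate the "single copy of the data, multiple models" criterion, here is a hedged sketch of one Couchbase cluster serving both a key value lookup and a declarative query over the same documents; the names throughout are illustrative assumptions.

```python
# One cluster, one copy of the data, two access models; assumes the
# Couchbase Python SDK 3.x, with illustrative bucket and key names.
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions

cluster = Cluster("couchbase://localhost",
                  ClusterOptions(PasswordAuthenticator("user", "password")))
collection = cluster.bucket("ecommerce").default_collection()

# Model 1: key value access, a low-latency point lookup by key.
cart = collection.get("cart::alice").content_as[dict]
print(cart)

# Model 2: a declarative query over the very same documents; there is no
# per-model replica to keep in sync, so changes propagate automatically.
for row in cluster.query(
        "SELECT c.items FROM ecommerce c WHERE META(c).id LIKE 'cart::%'"):
    print(row)
```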
Finally, here are some other things to look for when it comes to multi-model. Emerging technologies, the use of artificial intelligence: for this, too, I want to see artificial intelligence in the solution, and I want to know the strategy the vendor has deployed, because I'm buying in; I'm thinking it's very important to our future and to the future capability of the software, et cetera. Then there's integration with data catalog platforms, because they're emerging pretty strongly, and a robust user experience; of course, we can take the word robust multiple ways, but take a look at the UI for sure. In a multi-cloud or, excuse me, a cloud-native type of application, multi-model databases should help companies leverage the effort they are putting into populating their data catalog platforms for data sourcing, rule enforcement, security, and so forth. And as far as that robust interface goes, it should have a modern, updated appearance that helps business users easily accomplish necessary tasks.

I hope I've taken you, at least at this level, through the various models that make up a multi-model database, given you some criteria to look for in each one, and maybe scratched the surface a bit about where some of them might be applicable within your organization. Then I gave you some criteria to look for in your multi-model databases, if you are counting on a database for these capabilities, and I encourage you to do so. This brings me to the end of the formal presentation. Feel free to lob some questions in there, and I'm going to turn it back to Shannon, and Rick and I will pick up your questions.

William, thank you so much for another great presentation. If you have questions for William or Rick, feel free to submit them in the Q&A portion of your screen, and we'll answer the most commonly asked questions. Just a reminder, I will send a follow-up email by end of day Monday for this webinar with links to the slides and links to the recording. So, diving in here: for a use case such as a road network, a graph database would make a good digital twin. How would this work when existing databases for managing the assets on the network are relational?

Wow, that's a specific use case there. I'll take a first pass at it and then bring in Rick on this one. I'm not really familiar with that type of... I think you said a road network? I'm not familiar with what that means. It's very literally a road network, I believe. So R-O-A-D. Oh, okay, road. Yeah. Well, I mean, there's a lot of functionality when it comes to transportation with graph databases. A lot of the direction setting that happens, like with Google Maps or something, is all about a graph database, where there are various points and the time it takes to go from point to point. When there are a hundred ways to go from point A to point B, it can guide you on the fastest path, because a graph database adds up all the point-to-points that can possibly get you there. So anytime something is in transport, you definitely want strong graph capabilities.
I will say, though, since the questioner threw in the relational component to this, I'm not sure there are a lot of great multi-model graph-plus-relational capabilities going on; I'm even scratching my head trying to think of one. So that may be something that we still have to do with multiple databases, polyglot persistence, and keep an eye on the space to see what develops in that area that might help you consolidate.

Yeah, for my two cents, graph databases are generally used for link analysis. So as you mentioned, William, for trying to find the shortest distance between two points, for example, graph databases are good for that. In terms of the relational world, I guess you could do something similar in relational databases with your table linkages, but that would tend to mean that you're linking a table back onto itself, because you probably have the locations saved in one table. So that might be a little tricky in the relational database world. Yeah, it would be self-referencing tables, and that can be problematic from different perspectives. What else was I going to say about that before we move on? Well, he added that they're developing an asset management data standard. Asset management, usually that's pretty much your rows-and-columns type of need there, so that would be a relational database. I will add that a lot of relational databases, even the real legacy ones, have added capabilities for graph. But I think they are what I described during my talk: okay, they may give you the graph algorithms, but they don't really store the data as a triple store the way a graph database would, so you will suffer comparatively on performance. And keep in mind that graph databases, yes, they are about the linkages between physical items, but they can also draw out the relative importance of various nodes on the network, in this case maybe various assets out there on the road. So I think they're onto something; graph databases are definitely going to help their application. Yeah, what I could add to that, in terms of asset management, is that it may be a use case where Couchbase could work well, meaning you could have a document and then all the related documents nested within that parent doc. That would be easier to search from the database, and also a lot faster. So you could have nested objects within a primary object.

Very nice. Everyone's super quiet today; I don't have any additional questions at the moment. But, you know, Rick, as William was going through his presentation, was there anything you wanted to add that came to mind? Yeah, just from a Couchbase perspective, I think William mentioned this also: there are multiple ways of accessing the data. You can access data in Couchbase via full-text search, via the analytics service, via the data service; you can do key value lookups; you can run SQL query lookups. So I'd just like to emphasize the fact that there are multiple ways of accessing your data within our platform.

Well, that is all the questions that we have. Thank you both for such a great webinar and event. And as a reminder again to all the attendees, I will be sending a follow-up email by end of day Monday with links to the slides and links to the recording. We really appreciate everybody; thanks to Couchbase for sponsoring and helping to make these webinars happen. Thanks, y'all. Thanks, William; thanks, Shannon. Bye-bye.