And here we go. Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager for Dataversity. We want to thank you for joining the latest in the monthly webinar series, Data Architecture Strategies with Donna Burbank. Today, Donna will discuss "Emerging Trends in Data Architecture: What's the Next Big Thing?", sponsored today by DataStax. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A in the bottom right-hand corner of your screen, or if you like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag #DAstrategies. And we very much encourage you to chat with us and with each other throughout the webinar. To do so, just click the chat icon in the bottom middle of your screen to activate that feature. And if you'd like to continue the conversation after the webinar or follow Donna further, you may do so at community.dataversity.net. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and any additional information requested throughout the webinar. Now let me turn it over to Jen from DataStax for a word from our sponsor. Jen, hello and welcome.
Hi, thank you, and hi, everyone. I'm Jen Yanimetsu. I'm the director of product marketing at DataStax. I'm gonna go through a very quick presentation on how to simplify the complexity of data and advanced workloads in today's applications. We think generally that in today's world, our applications need a data management platform that is capable of processing mixed workloads and doing that efficiently, so you can focus more easily on getting value and a higher level of intelligence from your data. So we will focus in on NoSQL-like platforms, in this case, Apache Cassandra.
We think this is a critical modern database foundation that will serve mixed workloads very well. For those of you that don't know Apache Cassandra, it is an open-source project. It is a distributed data management system and is truly best in class for zero downtime. And the reason why the project can claim this is really its masterless architecture. A quick explanation of masterless is that all nodes are peers. So if any node goes down or fails, another node can automatically assume its workload. This masterless architecture provides true fault tolerance for data, and it can do this at massive scale and distribution. Macy's, for example, is one that we like to cite a lot. It has over seven years of zero downtime, and this includes through upgrades and maintenance. So that's pretty impressive, but there is also more we can talk about when it comes to the enterprise-grade strength of Cassandra. It was designed for linear scale, and that means that with the addition of new nodes, there is a predictable increase in performance. In other words, there is no performance degradation with the addition of new nodes, or scaling out. The project is over 10 years old and it has been contributed to by the community, including hundreds of big brands. You'll see some of those in the bottom corner there on the left. And it's really been battle tested by hundreds and thousands of enterprise, massive-scale deployments, at companies like Netflix, FedEx, Apple, China Mobile and so on. In today's world, we think you need to manage your data on an underlying infrastructure that is also purpose built for data portability. So that includes being able not only to be deployed but to scale across on-prem, hybrid, multi-cloud and even inter-cloud deployments, for those that are looking at multiple cloud providers.
Cassandra does this in a unique way, and it empowers you to build and run modern applications that scale across. I like to say "scale across" because you can really go across different environments, across data centers, on-prem, clouds and so on. So who is DataStax? Our roots are in open source and NoSQL, beginning with the Apache Cassandra project. We are the NoSQL leader in resilient, high-performance data management solutions. We have built a high-performance data platform that's powered by Apache Cassandra, and that is DataStax Enterprise, built with our expertise in data management. We also provide a lot for the communities and for developers, and I'll share some links at the end. We host conferences, events and meetups. One of our biggest content assets on our website is actually the digital docs for Cassandra, and we run an academy for online courses, boot camps and so on. So moving on, these are some of the things we see all the time from our customers that contribute to data diversity and complexity. I'm guessing many of these are not new to this crowd: things like legacy data that you need to integrate, real-time or streaming requirements, disparate or siloed data, data security, unpredictable scale, the massive scale of apps today, and then scaling across the clouds. With this modern data and its shapes, types and diversity, we really see that the workloads of today require more than just your traditional or single-workload-capable data platform. And what I think is important to note here is the data management evolution, beginning with relational on the left, moving to NoSQL, and then even graph becoming more dominant in today's world, where people are looking at the shapes of their data and trying to match them with a data platform. So these are some of the challenges that contribute to the mixed or advanced workload requirement that we see. I'm not gonna go through all of these in detail.
I'm assuming we'll share the slides here, but these are things like how to adjust data for different types of workloads and data models. So you need data model flexibility and API flexibility. When you're talking about mixed workloads, you need to think about how all of these things will apply to the different types and shapes that you model in: intelligent indexing, connected data and so on. So we're gonna go through an example here of a mixed workload. And some of the things to keep in mind are: how complex is your query? Is it simple, is it somewhere in between, or is it really complex? And how fast do you need it? So I love this slide. We'll go through it fairly quickly, but the first one is "find me Dave." There are, like, millions of Daves in the database, and that's bubble number one under CQL. So that's a simple query and lookup, and in Cassandra's case, it's lightning fast. It's designed to be fast for that kind of query. And what you'll see as you go through the different queries is a higher level of complexity in the query and also a different response time. So you see, with one application, zeroing in on a customer 360 type application where you want a holistic view of your customer, their relationships and their behavior, you might have a mix, maybe one to two, maybe even three types of workloads that you would need to manage. In our case, with DataStax Enterprise, our implementation actually fully integrates those different components: search, graph, analytics and stream processing. So you can get to all of those types of queries with one database. In our case, that's our recommendation to cover mixed workloads and really complex data. So I think what this brings is really blended workloads that you can get to with a database that can handle different types of data.
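To make the contrast between the simple "find me Dave" lookup and the more complex, relationship-style queries concrete, here is a minimal sketch in plain Python. It does not use Cassandra, CQL, or any DataStax API; the customer records, the `knows` relationships, and all three function names are hypothetical and exist only to illustrate why the query shapes differ in cost.

```python
from collections import deque

# Hypothetical customer records keyed by id (a stand-in for a keyed table).
customers = {
    "c1": {"name": "Dave", "city": "Denver"},
    "c2": {"name": "Dana", "city": "Boston"},
    "c3": {"name": "Dave", "city": "Austin"},
    "c4": {"name": "Omar", "city": "Denver"},
}

# Hypothetical "knows" relationships between customers (a tiny graph).
knows = {"c1": ["c2"], "c2": ["c4"], "c3": [], "c4": []}


def find_by_id(customer_id):
    """Simple query: a single keyed lookup, no scanning."""
    return customers.get(customer_id)


def find_by_name(name):
    """Search-style query: without an index this has to scan every record."""
    return sorted(cid for cid, rec in customers.items() if rec["name"] == name)


def connections_within(start_id, max_hops):
    """Graph-style query: breadth-first traversal up to max_hops away."""
    seen = {start_id}
    queue = deque([(start_id, 0)])
    reached = []
    while queue:
        cid, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for other in knows.get(cid, []):
            if other not in seen:
                seen.add(other)
                reached.append(other)
                queue.append((other, hops + 1))
    return reached
```

The keyed lookup touches one record, the name search touches every record, and the traversal touches a number of records that grows with the hop limit, which is one way to read the slide's point that query complexity and response time move together.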
So again, just to summarize, simplifying data complexity requires, we think, a single data platform: one that can handle mixed workloads, one that can handle real-time requirements, and one that can scale across data centers and different platforms. And so here, just for reference, are a couple of links and online courses if you wanna learn more about Cassandra, TinkerPop, and DataStax Enterprise. Thank you, and over to you, Donna.
Jen, thank you so much for this great presentation and for kicking us off. If you have questions for Jen or questions about DataStax, you can submit them in the bottom right-hand corner of your screen in the Q&A section. She will be joining Donna in the Q&A at the end of the presentation today. So now let me introduce to you the speaker of the series, Donna Burbank. Donna is a recognized industry expert in information management, with over 20 years of experience helping organizations enrich their business opportunities through data and information. She is currently the managing director of Global Data Strategy Limited, where she assists organizations around the globe in driving value from their data. And with that, I will give controls to Donna to get the webinar started for her presentation. Hello and welcome.
Hi, thanks. It's always a pleasure to do these and to kick off a new year with data architecture. So thanks to all who have joined. Shannon's already given me an introduction, and I think a lot of you know me, but I run a consulting firm that does information management, I've been in the industry for what seems like God knows how many years, I've been active with things like DAMA International, which I know a lot of you are involved with as well, and I have done some product development with some of the vendors on the market. So if you're not aware, this is a regular series on data architecture.
This year's lineup, hopefully fittingly, starts with some of the emerging trends: as we go into the new calendar year, what is the next big thing? What are the things we should be thinking of? Jennifer gave us some great ideas about some of the new technologies that are out there, and that can be overwhelming as we're trying to do our day jobs, which is why Dataversity is a great resource. I'll give them a little plug. But we hope you can join us for some of the others. You'll see that there's a wide-ranging lineup each month. Next month is on data strategy. We will cover the gamut from cloud warehousing to master data management, et cetera. So you can register for all of these, and Shannon generally says this as well, but they're all on demand afterward, and you will get the recording if you can't make it live, because I know how things go. We all have great intentions of making it live for the great chat, but then work comes up. So thanks to those of you who are able to pull away and join. Hopefully I can see you on some of these others. What we're gonna cover today is something a little special. We gave a bit of a sneak preview last month, also with DataStax, of this Dataversity report that I did together with Dataversity on some trends in data management, and there'll be another one coming up in a few months this year. So this information really came from you, the collective view as a community, and these are more and more valuable the more folks chime in. So I will give you some of the findings. You can also download this from both the Dataversity website and the Global Data Strategy website if you wanna see the actual report, and we'll give you some highlights and findings from it. Because we all do this day to day in our own office, it's nice to step back and see what others are saying and which way the trends in the industry are going.
So because we are in information management and metadata management, we all love definitions, right? So how do we start a whole discussion on data management without some core definitions? I already referenced DAMA, the Data Management Association. They have their body of knowledge, and of course they have a definition for data management, and you'll see there that it's this idea of managing and developing your information assets, controlling and protecting them, and doing so throughout their life cycles, which I think is important. A lot of us think of data in the moment, but those of us who are in the industry understand that there's a whole life cycle around creating, maintaining, retention periods, and deleting or offloading data. What I thought was also interesting, and this is what's great, is that you are not a shy collective crowd in data management. I know the chat is always very active on these sessions, which is great. I try to catch it after the fact. But we also get a lot of good anecdotal input from these surveys that isn't always just yes or no. So I thought some of the survey responses were particularly interesting, and we called out a few. The first is this idea of the people, processes and technology around data as an asset, and I'll cover that a lot more in this presentation. I am a super techie nerd, and I always say that's a good thing, but the more I'm in the business, the more I realize that it's the people and the process around that technology that really make this thing work. We've all been on that project where you think, and you might be very right, that you've built this great application, and you stayed up till midnight to do it, but no one cared, or some other project that wasn't even as technically good got the funding and the buy-in. So why did that happen? Or maybe you built it and it was hot for a few months, and then everyone forgot about it and it wasn't maintained.
So that's why that people-and-process part is so important to data as you manage the life cycle. The second one touches on that as well: it's an organizational capability supported by tools, processes and standards, and people, of course. I think sometimes we techie folks love to start with the tools and the tech, but really the tech is there to support the business. And I thought that last one touches on that as well, that efficient data management really makes the business activities more effective. We'll talk a lot about that in this particular presentation because that is hot. So if you've joined these sessions before, you've probably seen this framework. This is our framework at Global Data Strategy that we use for our practice. We've gotten some great feedback, so we keep using it. It's similar to the DMBOK a bit, but it's augmented by our focus on the business, particularly with our practice, and by some gray hairs and scars from doing this for many years, of what matters. So we always start at that level one, which is that top-down view I mentioned: what's the business strategy? Again, when Jennifer talked about all those different use cases, are we just finding Dave, or are we trying to get the social media around Dave, or are we trying to do real-time data streaming of what he's listening to on Netflix? That last one might not be important if you're a bank. So really understanding what the use cases for data are is so important, and more importantly, how data can drive business strategy. That little alignment piece in the middle really highlights that it's bi-directional, and I'll talk more about that as well.
Yes, of course, don't do anything in tech unless it's aligned with the business strategy. But more and more, and this is honestly why I'm still in the business, for those of you who have a business focus or an interest in innovation and entrepreneurship, often it's the data that drives new business ideas. Think of Uber and Netflix; those are data companies that happen to do other things. So that's what's fun and exciting about being in data right now. But there's also that level five, which gets a lot of us really interested in being in the business of databases, and that's not just those little cylinders you see there that look like relational databases. We'll talk more about that in this presentation. They are still the leader, they're not going to go away anytime soon, but there are other options as well. There are big data platforms, there's streaming, there are documents. We often forget, when we focus on data, that there are documents out there too. I know that came up with one of my clients. We were talking about data security and privacy, I think it was GDPR, and they said, "Well, we don't have to worry about PDFs and documents, we're the data management group." And I said, well, if someone's credit card gets stolen and their identity has been out in the market, and you say, "Well, it's okay, it was in a PDF," I don't think that'll matter. It's still PII, right? It's still their information. Information comes in a lot of formats, so you need to look holistically across documents, databases, everything. And then as you move up the stack, how do you integrate that data? It is in disparate sources, so how do you manage a PDF, or a database, with an XML stream? And then most importantly, I think, and near and dear to my heart, is metadata management, which is not only the technical integration but also the business focus, right? What does the data mean? What's the context around that?
And level three, maybe not very elegantly, I just say is what you do with data. What's the stuff around it that adds value? So we'll talk a lot in this presentation, because it's still hot, about business intelligence and analytics, how you get insights from data. But we who are in the business know you can't do that without the things on the right: things like data quality, data architecture and modeling, and making sure the way that data is stored is correct. Is it a data lake and/or is it a data warehouse? Do we have core domains mastered? Do we have that single view of Dave? Do we understand that there's a single view of customer, of product, of invoice, of patient, right? And then data governance is that part we already touched on with the definition of data management, the people layer. And yes, there's technology for governance, and we could probably wax poetic and argue all day, as we do; in architecture we love to have some good debates. You know, is architecture governance? Yes. Is data quality governance? Yes. But I think what we're highlighting at this level is the people and the process and, more importantly, the culture. Does everybody in this data-driven business understand their role, their accountability and their ownership of data? And then how do you have the policies and processes in place? And more and more, and I'll talk about this as well, the ideas of data processes, data governance processes and just plain old business processes are merging and melding. When data governance is done right, it becomes that business-as-usual activity. You don't necessarily think you're doing data governance; you're just doing your job, and data happens to be a part of that. So we'll cover a lot of these things when we go through the survey, because it touches on a lot of the areas of the industry.
One of the key things that popped out of the survey, and I'm going to interweave things I discovered and heard in the survey with things I see day to day in my practice, because a lot of times they overlap. A couple of times we've seen some things that weren't in the survey that I'll highlight, but this is one that we're seeing everywhere, and I'm sure you are too: this idea of the data-driven business, or, and I'll talk more about this too, the data-driven organization, since it isn't always a retail business. What was heartening is that in the survey, 75% of the folks said that they see data as a strategic asset. We've been shouting that from the rooftops for years in the industry, but more and more, most of the engagements in our practice come from the business sponsors. Sometimes from IT, but more and more it's business people that are saying, "Hey, I get this idea of data as a strategic asset. Not exactly sure how to get there, but I know I need to." When you look at the drivers for that, a lot of it is the traditional yet always important things like: how can we be more efficient? How can we save costs and increase efficiency? You could say that's the same old, same old, but I think it's also heartening in that people are seeing the value of data management. You could have asked me this 20 years ago and I would have told you: data management helps increase efficiency. Sometimes folks may see the price tag, "Oh, it's gonna cost me this much to put a data quality process in place or a master data management hub in place," but the long-term effect, as we know, can be much more beneficial over time, because when you start looking at what folks are doing, people are spinning, right?
They're doing it in spreadsheets, or they're cleaning up. I was talking to a client just this morning: the poor finance analyst that was doing the report had literally been up all night trying to get the numbers to match, and it was his boss who said, "Can we do this in a warehouse? Something more efficient?" Because once you get it right, people can actually do their day jobs and not spend time trying to make the data right. What's a newer trend, and you'll see the numbers aren't quite as high, but I know in our practice this is actually one of the leading things we do, so we see that number probably more around 80% of the folks that come to us, though that might just be the selection of who tends to come to us, is digital transformation. I know that's a buzzword, but when I counter some of the negativity about buzzwords, I think: was it in the 90s when they talked about e-commerce and e-business, right? Everything was dot-com, and then the dot-com bubble burst. It's like, yeah, we don't have any dot-coms anymore; that amazon.com was just a fad, right? It's not that it went away. Some of the unsuccessful ones didn't stay, but it became business as usual. Who even thinks that dot-com is a big deal anymore? Of course it's dot-com; it's brick and mortar that's actually the outlier, right? So we're seeing that with digital transformation. It's a big word, but half my life is on a cell phone. If I can't do my banking or buy a ticket for a show or do anything on my phone or my computer, I almost don't do it. So that is digital transformation, and as we know, that's driven by data. You can't do that exciting stuff without the foundation of a data platform or a data management foundation.
So I do wanna drill into that a little bit, because you saw, let me just go back one for a moment, you saw there are almost two ends of the spectrum. In a way, "how do we save costs and increase efficiency" is business as usual, just doing what we do better, and then we're talking digital transformation, which is the other end of the spectrum, and I do think there's a slight difference. One could argue that we've been doing data-driven business forever, and of course many companies have. Think of insurance; I think of some of the actuaries out there saying, "Data science? We've been doing that since we've had companies." Insurance almost is a data-driven business; it's how you manage risk and understand your book of business through data, right? So the idea that you can be more efficient, reduce redundancy and eliminate manual effort through data and digital, we've been doing that for a long time, as well as: how do I have better marketing campaigns with click-through rates? How can we understand how customers are using our product?
I put that under "you're optimizing your business": how do we do what we already do, better? But what I do think is different, and even that's not so different anymore, is this idea of becoming a data company, where you're really transforming the business or coming up with a new business model entirely. If you're a data person and you're also an entrepreneur, it's a great time to be alive right now, where data is the product or you're monetizing data as a product. This could be several things. If you've been on these before, you may have heard me talk about a big energy client we had in the UK that decided that doing home heating oil was a declining business, as people are trying to use less energy. So how can they monetize the data they have? They really have an app-driven, data-driven application where people can control their energy costs and see analytics on their own energy usage, and data was their new hot product. So there are companies transforming themselves, realizing that the exhaust of data that comes out of their business can be monetized and used for a different purpose, and/or there are entirely new business models. Think of Uber, right? That is a data company that's using data in a very creative way to be a new business. Or you could even argue Amazon, right, is a data company that uses data really, really well. So I think that's what's exciting as we look at different data management opportunities. I thought this was interesting to show. This is a report from the World Economic Forum, and they actually had some metrics around this. What they're saying is that in the old days, everything was brick and mortar. You actually sold stuff, and business was driven by a product focus. So if you look at the old days, a whole seven years ago, the big companies in the market were things that sold things: Walmart, right, ExxonMobil, energy.
So you look now, and rather than a product focus, it's more of a data focus: Alphabet, so Google, right, and I mentioned Amazon, and Microsoft. You could argue that some of these are digital slash data, but they're so interlinked. Their finding was that, for the first time in recent history, data is more valuable than the actual products being sold. Amazon would be a great example of that. I don't think they're the leader because they necessarily sell the best widgets; they sell the widgets in the best way, through data. So I thought that was an interesting highlight, that even the World Economic Forum is saying, hey, data's the way to go. Everyone needs to be in some way data driven: either your company is data, or your company runs on data in a more efficient way. I found this next one interesting, and this is from our practice, a fairly accurate infographic bubble chart. I've been in the industry probably going on 20 to 30 years now, somewhere in the middle there, and all of us who have been in data management for a long time have probably worked in finance and/or insurance and/or government or healthcare, some of the big players that have gotten the value of data management for a long time. What's fun for me, and why my job is so cool, some days, I feel, is that more and more organizations are understanding data management and the value of it. And to be fair, I think a lot of the tools are just a lot more consumable. The stuff DataStax mentioned at the beginning, the fact that that's available to the average organization now at a reasonable price, it's light years ahead of what anyone could do maybe just five years ago, right? So the fact that this is ubiquitous and there's a lot of open source means a lot of opportunity and gets more people in the market. So we have folks from small nonprofits; we have a Head Start program doing data; we worked with a small museum in the Midwest; we have many of them here. Now, in Latin
America we did a car manufacturer; it's probably not new for them, but they are looking at new things like the internet of things. So there's a diversity of types of companies, and that's why, when I talk about the data-driven business, some of my nonprofit clients (you'll see that's a big chunk of our business, not necessarily by design, and I will talk more about this push towards data for good) often remind me it's not a data-driven business, it's a data-driven organization, or a data-driven mission, or a data-driven culture, which I find an interesting spin. I put the universities and education in similarly; we seem to be working with a lot of those as well, and it's again a very different model than trying to sell more widgets or maximize profits. You're trying to maximize people and mission and value, which is a different spin, which is fun. The other key finding we saw in this, which ties into the data-driven business, is the idea of having better insights around your business, which in some ways you could say is traditional business intelligence and analytics. It's not like we have not been doing that in the past; it's been around for quite a long time, but it's still valuable. It's like saying dot-coms went away; we still have dot-coms, we're just maximizing them. So you'll see here that for 80% of respondents, not that they were using it (I think if you asked who was using BI, it's probably higher than that, I won't venture a guess), it was one of their key drivers for data management. So when we asked, "What is the main thing you're working on with data management?", it was overwhelmingly BI and analytics. We put both of those together; we didn't separate reporting from advanced analytics, and as you know, they are slightly different. But we did break out that 87% were doing BI, or your traditional reporting, and 87% also had a data warehouse. And when you look at that
idea of, is it big data analytics, is it warehousing, you'll see that about 22% in this case were using both a data lake and a warehouse. I picked that statistic because I think that's the more normal use case, if I could venture that. I think there was a bit of a hype cycle with big data and data lakes, and I have so many rants the longer I live in the business. I know I'm getting old when half my sentences start with "don't get me started." But there was a period where the data lake was the thing and everything else, you didn't need anymore, and I think a lot of us knew that was never realistic, because there's this "and" condition, right? You can have a data warehouse, which is great, and you can have a data lake, which is great, and I think the more advanced organizations I work with have both and are using both in conjunction with each other to get the best-case scenario. We'll talk more about that. One of the other reasons it's so exciting to be in data nowadays is that there are so many choices, and this is where things like Dataversity can help demystify some of those choices. So BI and analytics are not going away anytime soon as a huge driver for business insights. Here's a little more insight into this. This was the survey question where folks were asked, "What are your main goals for implementing data management in general?" You'll see that analytics and reporting were huge. That idea of saving cost, reducing risk, driving revenue and digital transformation, we talked about already. Reducing risk, I thought, was an interesting one, because that's going to tie into another thing which is near and dear to my heart, which is data governance. That will be something else we'll talk a lot about, because it came up a lot and it's a driver. So, data governance: I was pleased the other day when I was at a client and I heard a 20-something young professional say, "Oh, it's that buzzword, data governance," and I just think, wow,
we've made it now. Data governance is a buzzword. It used to be anathema; you know, you don't mention the governance. That's something people think of like brushing your teeth: you have to do it, but it's not very exciting. I'm saying the opposite. I think it's the opposite for several reasons. People are realizing that to make all of this sexy stuff work, like AI and digital transformation, you need to get your data right. Dataversity, I think just yesterday, came out with a trends report on data governance. I'll give them a little plug. I had put some comments in that article, and I think I gave the anecdote that I've actually had some venture capital firms with startups doing AI come to me and say, "Could your company give us some training in data governance? Because we don't want to invest in this company that's doing AI without a data governance foundation." And I said, okay, now for me, that is "we've made it," because I always think of venture capital and AI as the move-fast-and-break-things culture, and even the move-fast-and-break-things culture is realizing you have to govern to get the AI right. I think that's a great sign. So people are realizing, "If I'm going to be data driven, I need governance. I need data security," which was even higher. In my color commentary, I think folks get security almost before governance. That just seems a little more visceral: you know, "don't steal my data." Governance is a little more nuanced; it's about ownership and traceability and things like that. The other one I put in there, I'm very passionate about. When I've talked in the past, you might have heard me talk about governance, and I always summarize it with the carrot and the stick. The stick is, you know, manage your data, make sure the data quality is good, don't do this, don't do that, and that's the part that doesn't motivate most people. But the idea of collaborating better and working together as a team and making sure
the data is improved so you can do data-driven business, collaboration, and AI, tends to get people excited about this, and then the rest comes. So most of our data governance implementations start there, with: what is our core? You know, we've all done these: what are your core principles, what are the top 10 principles of data governance? That can be dry and boring and seem academic, but not if you get them right. There was one I've talked about before: we were at a hospital, and the principle was improving the lives of children in crisis who were in intensive care. When there was an argument about a master data management field, people actually brought that up: guys, we're here to help children not die in the helicopter as they're getting medevaced; we need this data field. That was an extreme example because it was so mission driven, but what we've found is that when you get that core principle, we're trying to do X with data, whether that's maximize profit or understand the customer or whatever, people want to make sure the data's right, and then they do data governance because they want to, not because they have to. That may have seemed kind of preachy, but we just see it work all the time: get the hearts and minds, and the rest of it follows. If you're just trying to force rules, it definitely doesn't happen. And because data is so hot, I'm seeing that happen a lot more easily. We often start our strategy engagements with interviews, and we have business people asking for governance: can we please have some more data governance? I'm sick of arguing about KPIs; can't we agree and just put it in the glossary? That's great, because people have felt the pain, right?
So I found this interesting. When we're talking about collaboration, who drives data management in an organization? One of the things I thought was heartening, and you'll see here, just a note, to be data-correct: it said select all
that apply; we didn't say just pick one. I think the positive of this is that people are working together. You'll see business stakeholders, analytics, architects, CEOs, and CIOs driving this, which fits with what I've been seeing, and I think that's great, because if any one of those groups does it alone, it's not going to work. If business goes off and does shadow IT, it's not going to work, because IT is a thing; you have to have some skills. If IT goes off and does shadow business and doesn't listen to the business, that's not going to work either. So you really need all of that. But there was one more, and this is probably a gap in our survey: when people typed in their own answer under "other," most people typed "data governance lead," and I think that actually makes a lot of sense. When you hire someone for data governance, who is that person? We often say that's the champion, someone who's going to get hearts and minds, who can do data project management, but who can also really lead and be a data leader. So that fit well, and that was a good call.
Moving on from that, I think this seems realistic to me too. Unless you live under a rock, if you're a business person, you've seen data in Forbes, you've seen it in Harvard Business Review. "Data is the new oil," all of those statements we've all heard, right? But I think where people break down is: I don't exactly know how to do that, because that's not my wheelhouse; I haven't been trained in data. So I think people are at the stage of "I know it's an asset," but if you look at some of the other findings, they're not quite there yet. In terms of data quality, maybe formal metrics aren't being managed; communication still isn't quite there. Just think of the different personality types of the CEO and the CIO. Not to stereotype, but business folks tend to be move fast, break things, right? And your CIO is probably going to be more
cautious, because they have to be: things are going to break, and they're going to be held accountable, right? So I think it's probably also healthy that people are realizing that. That one's kind of hard to read. We asked, do you trust the data quality of our assets? It's sort of low, which means data quality isn't seen as good. It's kind of a double negative there. People don't trust it, which could be seen as bad, but I think it's good that they're actually paying attention, right? If people weren't looking at the reports, it'd be, sure, whatever, looks good to me. So I think being more self-critical is probably a sign that people are paying attention. I guess it's how you read the data: data storytelling.
This last one was not in the survey, but I put it in because I keep hearing it, and I found that really interesting. It's probably been the past six months where I've gone to clients and they have brought it up, or I've shown them the framework and they've said, where's ethics? You have governance, but where's the ethics? That was sort of an aha moment for me. And these are not always the nonprofits and the data-for-good type companies. Sometimes it's your traditional bank or retailer or insurance company; a manufacturing company was one of them the other day that was talking about ethics. I think a lot of us are both consumers and products, really, in a way. If we're on Facebook or on Google, we're a data product, right? So we can feel it a little bit, that idea of: I can do that, but should I do it? That's one side. The other side, and we do work with a lot of nonprofits, is this idea of data for good; you can Google it, there's tons out there. I think some of it is people who've been in the industry for a long time and may have been profitable through data, and they say, I'm going to give back; how can we use data to do that? And if you're not familiar with them, the United Nations has a global
set of sustainability goals. So many companies have concentrated on, you know, can we reduce costs and increase revenue, and that's not a bad thing. But these global goals ask: are we helping the health of our community, alleviating poverty, promoting equality, peace and justice, and all those things that we probably should be doing but that get lost when we're looking at just profit? So that's a whole initiative that I'm hearing and seeing more of, and I thought it was worth mentioning.
Not only are people explicitly mentioning ethics as part of data governance, but I've noticed something else. I'm a big fan of customer journey mapping, and then doing data overlays on that, because it really helps you hone in on how you use that data. I've heard some, what do you call it, skepticism isn't quite the word I was looking for, but concern around customer journey mapping: that it seems like folks are just trying to make more money off the customer. Maybe that's true, but I've seen more explicitly empathetic customer journey mapping. One client said, hey, what's frustrating our customers? We walk through in the customer's shoes: would this feel creepy? We worked with one university, and that was actually part of it: what does it feel like to be a student, and how does data support that? So I'm seeing that used for good as well, in more of an empathetic way: we're getting data from our customers, so how does it feel? Or in our customer support process, do people want to work with a chatbot with AI, or would they rather have a human being helping them? Use data to make the customer journey more pleasant. I've just seen that coming up, and I'm generally a fan of it, so I thought it was worth mentioning, because it's something we've seen a lot.
Okay, so I talked a lot about the people and the process and the governance, which is important, but what is also important is the technology, right? We can have great governance, but if there's no
tech platform, then it was probably not worth having the governance. What is still true, and I still think it's a good thing because they are good solutions: 81% are still using relational databases on premises. If you add in the cloud as well, it's even higher. It's hard to say exactly, given the way we calculated the data, but definitely in the 90s, probably close to 100%, so most organizations have some sort of relational database, whether cloud-based or on premises. What keeps me up at night is that you'll also see that 71% are still admitting to using spreadsheets as a data platform, so it's probably higher, given the folks who don't want to admit that they're using a spreadsheet as a data platform. Now, that said, I'm a big fan of spreadsheets; I use them myself. I just wouldn't use them as a data platform. For ad hoc analysis, or actually doing financials on them, that's good, but they really shouldn't be a data platform. And we'll talk more about this: relational databases are fine, but that's not the only thing for dinner anymore. There are a lot of other options, like cloud-based, which is still relational, but also graph, NoSQL, and big data. There are just a lot of different use cases, as Jennifer touched on at the beginning of this presentation.
So here's the data that we got those highlights from. You'll see, far and above, the two big pillars there, relational databases and cloud-based relational databases, then spreadsheets, and a lot of others, but a lot of the others are definitely small. What I also thought was interesting is, if you look towards the bottom, legacy platforms are still fairly high. I mean, they're higher than graph databases. One of the things is that a lot of companies, a lot of big banks, a lot of financial institutions still have COBOL on the mainframe. One could be a bit snarky and say, well, they're still working, so don't knock them. I wouldn't say do any new development on a mainframe,
or maybe you could argue with me in the chat. But when you look ahead, what I find interesting is that a larger percentage of relational databases are moving to the cloud. They're not going away, and there's still a high percentage. I mean, they are very good for what they do. Relational algebra is a thing, and it helps with data quality and it helps with traceability. And I find a lot of clients aren't even at that level of maturity: the number of clients where we go into databases that are used like a spreadsheet, that really don't have any referential integrity or the other great things that relational databases are good for. There's still a lot of maturing that could happen right there. But what makes me feel very positive is that you'll see those lines spread out across everything else, and that feels about right to me, because I don't believe this idea that any one thing is the answer, which was my rant earlier on the call. Relational databases aren't the best thing; they're one of the options. Data lakes aren't the best thing, and neither is graph. I know vendors can sometimes try to say "we're the only thing," but they're just wrong. Not that I'm biased, right? There's a tool for every job. It's like saying a fork is better than a spoon. Well, if I'm drinking soup, I don't want to use a fork. That should be obvious, but when we get into tech, there's that fear of missing out: am I on the wrong platform? And it is stressful. You'll see towards the bottom there's a high percentage who admit they don't know, and it's not because you're dumb; there are a lot of options out there. So I think this idea of a medley, or an ecosystem of tools rather than a single tool, makes a lot of sense. To go back to Jennifer's use case, which I really like: do I want to find, I forgot his name, Dave? Bob? I think it was Dave. Do I want to find Dave? Do I want to find Dave's friends? Do I want to see the transactions around Dave? Do I want to see social media around
it? Those all have a different use case. Graph could be for the social connections; relational or a master data hub can be for "find a single Dave," et cetera. So mapping those use cases is a great way to start.
What was interesting when we went into future technologies was something we didn't really talk about a lot. A lot of folks mentioned it, and maybe we shouldn't have put it in the "other" category: this idea of containers and Kubernetes. They were kind of similar: especially with this complexity, can I box things up in a container to make them more portable? Not a bad idea. The future plans sync with a lot of what I'm seeing as well. If we think of BI as your bread and butter, I need to make data-driven decisions based on common KPIs, we'll still be doing that for a long time. That's still critical, and it's still complex. But moving past that into more AI and deep learning use cases is interesting to a lot of people, and I think a lot of people realize their data has to be good to do that.
The next two sort of fit into a particular category, so I found that interesting. Industry 4.0: I'm actually at a client right now in Latin America that is doing just that. We were just at a plant yesterday looking at their sensors, and it was way cool, if my nerd can come out a little bit. When you think of it, it's both sides of that slide I had, the new business model and the current business model: yes, you're being more efficient and you can automate, but it really is a new way of looking at the industry. Digital twin I put in a similar category, and actually this client is doing that as well, or looking into it. I'm not going to change the actual machine or car or boat that I'm building; I can build a twin and do a virtual version of that, and test before you do it in
the real world, which makes a lot of sense. So I thought that was interesting, and a little bit different from everything else we had talked about in the survey.
As we look at key drivers, and I know it's a busy slide, this is what folks are currently implementing. I touched on it already: BI, data warehousing, and then governance and quality around that. When you look at the next few years, folks are looking at these newer technologies like semantic web. I was a little bit surprised, actually, that that was so high; it was the highest one. I'm hearing a lot about it, and I think graph and related things are really interesting to people. Then virtualization: do we have to move the data at all, or can we have a data virtualization layer? Some of the others aren't a surprise: data science, analytics, and big data, as well as self-service. More and more business people want to get their hands on the data; many business people are quite savvy with data now. To support that, you'll need metadata management and governance to make sure that the data is right. So that again felt right and good to me; it seemed to align with what I'm hearing, at least.
So we sort of promised on this webinar: what is the next big thing? Drumroll, please. I don't think there's any one big thing, which you might have already noticed. There are a lot of little things that become big. But if I had to vote, here are Donna Burbank's top five predictions for 2020. Here they go; let's put something out there. I think, and I just touched on this, the line between a quote "business person" and an "IT person" is blurring. A lot of business people who have been very savvy with data in the past are becoming even more so. It's being taught in universities, R and Python are just ubiquitous, and the tools are becoming so much more user friendly that, we could call it shadow IT, or call it something else depending on the use
case, business people are becoming more active in understanding data. I also think this blurring of data management and the business will continue. When you think of digital transformation, data is the business. I'm often corrected by business people on this; the term just seems so weird. What is a business person? It could be a scientist, it could be a teacher, it could be an actuary. Try to tell an actuary they're not a data person and you might get a funny look. So I think that'll continue to blur; we're just becoming a more tech-savvy world. Number three, we talked about this: there won't be one thing, where everyone just buys a relational database and puts it on a server. There's a mix of data-centric tools, and that's why we need to become more savvy: it's an ecosystem of tools that work together. These aren't necessarily in order, but I think governance and ethics will have a much stronger role, which is a good thing. Again, we are all both products of data and users of data, and we have to understand that at a deep level. And I think AI, analytics, and BI of course will continue to be a strong driver, but with more of a focus on predictive analytics, not just descriptive. AI is definitely becoming hot. If you think of the Gartner Hype Cycle, I think a lot of people do have that bit of fear of missing out with AI: should I start with AI? I don't think it's a fit for everybody, of course, but for those it does fit, it's definitely an exciting time and something people can move forward with.
Just to summarize: again, most of the findings here came from this white paper, and you can download it at DATAVERSITY or Global Data Strategy. If you're interested in these, there'll be plenty more throughout the year; you can register now or later. Quick plug for Global Data Strategy: we do this for a living, so if you need help, let us know. And I
want to give some time for questions, so I'm going to pass it back to Shannon, who can open up the floor.
Donna, thank you so much for kicking off the new year with yet another fantastic presentation. We love it, as always. To answer the most commonly asked questions, just a reminder: for this webinar I will send a follow-up email to all registrants by end of day Monday with links to the slides, links to the recording, and anything else requested throughout. And we'll invite Jen to join us as well in the Q&A as we go through. Now, Donna, on page 11, what do the bubble sizes represent?
So, to reference that for folks, I'll make people dizzy, because there's a fast way to do this. So on page 11, this was supposed to be the relative size. The majority of customers came from finance and insurance, the smallest from restaurants, and the other ones are indicative of the overall number of clients for our particular business. I didn't get industry data; this is just from us, so you can see the relative size.
Love it. So, the higher percentage use of spreadsheets: is it because of the evolution of big data processing, with customers now confident that spreadsheets can easily be integrated? Is that true?
I'll get Jennifer's view on that, but I'll answer first. I don't think so. I'm still seeing folks either using it just because it's there and it's quick, or sometimes going around the official KPIs because they want to do it themselves, or it shows a lack of integration. I think there's a use case for spreadsheets. For example, if there is a trusted data source and you want to download it into a spreadsheet and slice it and dice it, if that's how you're comfortable, that's fine. But using it as an official source or a trusted record, that makes me nervous. So I'm not sure big data would change that; I think it still needs to be managed and governed. But Jennifer, I'll pass that to you in case you have any additional thoughts there.
Yeah, on spreadsheets, I think
it's also dependent upon the size of your data and the processing of the data. If you're just storing flat data tables, and it's not past a million lines or whatever the limit is, you could maybe manage. But there are a lot of things like data governance and security, and when you have collaboration or more people accessing the data, spreadsheets become much harder to manage in a group setting.
I think that's a great point you touched on last, the collaboration. There's some online editing together, Google Sheets and things, but I think that's one of the biggest things you miss: a spreadsheet on someone's hard drive is a little limited for sharing.
So Donna, you mentioned that each data platform and storage option has its pros and cons for various use cases. With that, in your experience, have you seen any papers or studies that document the different use cases for the different platforms?
Yes. I'm sorry. Well, I think Jennifer had some good ones in her presentation. A lot of the DATAVERSITY stuff does; I know last year I did one on graph and talked about some of the use cases for graph, and we'll be doing some more. I think there's a lot of good stuff out there; I can't point to just one. Sometimes it's actually going to the different vendors, say the graph vendors, listening to their use cases versus relational, and summing that up. I don't think there's a one-stop shop, unfortunately.
Agreed and understood. And Jen, I have a question for you. A lot of your use cases were for-profit examples. Do you have any government examples, government use cases?
Yeah, I'm trying to figure out which ones I can actually talk to. We have state and local, and also federal examples. I think the IRS is one that's public. There's also one for parks and recreation, but that was, you know, again,
very massive amounts of data is where Cassandra shines. So for the smaller data use cases, it doesn't mean that you can't use Cassandra, especially if you need something mission critical or with high availability. But it's the advanced security that we've built into the platform that resonates well with government, and then the scaling across different organizations, services, or units within local government is, again, one of the reasons why people come to Cassandra, because we can scale across not only geographies but across data centers, sort of join across disparate data sources, and in some cases across different platforms, so on-prem versus cloud, or different clouds. I know that doesn't answer the question directly, but if somebody wanted to reach out to me separately, I could probably go through some other cases. I'm just not 100% sure which ones are confidential.
No, that's a great answer, and I think that's very helpful. I will of course include the contact information for DataStax in the follow-up email. So, continuing through here: what's a digital twin?
A digital twin really is a digital representation of the thing in the real world. One example: one of our customers that we worked with had a big oil rig. You have, I think, a CAD diagram, you know, AutoCAD, where you design a house or you design a car, and in a way that's sort of a phase-one digital twin. But with the digital twin you can actually do analytics, use-case analytics: what if we took out this pipe, what would happen? I guess they're doing some of that in medicine too. You test it out on the digital twin before you have to go out to the oil rig and actually change that pipe. So it's nice: you literally have a digital representation of that physical object that's out there in
the real world, and those are often really complex.
And what is Industry 4.0?
So Industry 4.0 incorporates a lot of different things, but think of it as that next generation of industry, I guess you could say. I probably should know this, but if 1.0 was, you know, Ford automating cars on the production line, things are much more advanced now. Think of sensor data coming from the machine. At the client I'm at today, everything's done with robotic arms, and they're actually building this big machine robotically. Then you have sensor data, so you can say, when does this thing need maintenance? Through analytics, I can see that a screw of this type, or a pipe of this type made of this material, generally wears down at this time. But you can also look at the data itself: hey, this thing is starting to wear out. This particular company has production line statistics, and the production line is affected by the weather, so based on all of those big data sources you can say, hey, it might slow down a little bit in this season, or maybe you have to do some proactive things. At this company, they need to adjust the viscosity of the oil based on the temperature. So it's really using data to drive industry. It's not just automation; it's that next level of real-time sensor data, and then linking it all together. I guess it's like the Uber of power plants, or whatever you're automating. Think of Uber, which can look at airplane traffic and car traffic and consumer demand, and put all those together to get a car there for you in five minutes. It's the same thing for a plant: how can we look at all the different factors, automate it, scale it, and use data to our advantage?
And speaking of automation, and Jen, you might want to weigh in here as well: what will be the role of AI in 2020 and
forward? Will we have any data-driven AI or knowledge-based AI?
Yeah, I can touch on that; I'll jump in here. I think for AI, the amount of data will just grow exponentially, right? You're collecting so much data so that you can not only learn the patterns but also forecast and project patterns, and things like that. So the analysis on the data will become really important as well, and that will demand high levels of performance on massive amounts of data. We call that AI scale when we talk about how something like Apache Cassandra can address those coming use cases, and we see that really impacting the data world and data management, for sure. Donna, anything you want to add to that?
Sure. There might be a big beep coming soon, but there we go. I think it's going to be what I mentioned: things like dot-com became business as usual, and you almost don't notice it anymore. We all think of Siri and Alexa when we think of AI, or I do, or chatbots and things like that, which are sort of externalized. I think what's more exciting, actually, is the stuff behind the scenes: AI for the plants, like I mentioned, can I do predictive maintenance? Or AI in my phone that knows the best route for traffic based on my past usage. I think it's that stuff that really automates the next predictive level, the next-best-action type of things, that maybe aren't the in-your-face robots but are really driving the business side. I think that is probably more interesting.
Well, Donna and Jen, thank you so much for today's presentations, and thanks to our attendees for being so engaged in everything we do. But I'm afraid that is all the time we have for today. Again, just a reminder: I will send a follow-up email to all registrants by end of day Monday with links to the slides and the recording, and I'll get you a link to the report that Donna was talking about as well, and
hope you all have a great day. Thanks so much.
Thanks again to DataStax for sponsoring and helping make these happen.
Thank you, Donna. Thanks, Jen.
Thank you. Bye, everyone.