Here we go. Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager for DATAVERSITY. We'd like to thank you for joining the latest in the monthly webinar series, Data Architecture Strategies with Donna Burbank. Today Donna will discuss "Building a Future-State Data Architecture Plan: Where to Begin," sponsored today by DataStax.

Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A section. If you like to tweet, we encourage you to share highlights or questions via Twitter using #DAStrategies. And we very much encourage you to chat with us or with each other throughout the webinar. To do so, click the chat icon in the bottom middle of your screen to activate that feature. And if you'd like to continue the conversation after the webinar or follow Donna further, you may do so at community.dataversity.net. As always, we will send a follow-up email within two business days containing links to the slides and the recording of the session, and any additional information requested throughout the webinar. Now let me turn the webinar over to Louisa from DataStax for a word from our sponsor. Hello and welcome.

Hi Shannon, thanks for the intro. I just want to check and make sure you can see my screen. Okay. Looks great. Awesome. All right, so hi everyone. As Shannon mentioned, my name is Louisa B., and I'm the Senior Director of Product Marketing here at DataStax. As the webinar title suggests, I'm here to talk to you about building a future-state data architecture plan. And I think the question really is where to begin, right? So before we talk about that, let's quickly discuss how we got here. I know a history lesson can be a bit boring, but I will definitely make it quick here. The initial foundation for data architecture was the mainframe.
So while several manufacturers produced mainframe computers for commercial use starting in about the late 1950s, they didn't really take hold until the late 70s and 80s. And this was in large part due to the introduction of the relational database. In the 90s, we then saw the introduction of the client-server model, which was really a distributed application structure that supported new things like email and what we used to refer to at the time as the World Wide Web. And of course now it's all about cloud. But what does this mean for your data architecture? Let's take a look at an example.

So here's an example of the data architecture of a big retailer, one of our customers actually. As you can see, the volume and types of data that must be shared across different parts of the business have grown significantly over the last 20 years. So for example, think about personalization, one of the workloads up here: how data is leveraged to provide a truly personalized experience to customers based on their prior purchases, as well as things like current inventory and search history. You can just imagine the amount of data that's involved in doing things like that. And this is a huge change from the separate point-of-sale and inventory management applications that we had until recently. Those systems of engagement and systems of record used to run just fine on our mainframes and relational databases.

Now let's talk about your own data architecture. Can your legacy systems keep up with the needs of your business? Are they built on mainframes? Are they architected on relational databases? And what about the cloud? Do you have a clear path to modernize your infrastructure? The reality is that modern applications require modern data management, and the foundation for that is modernizing your database. That's what I'm here to talk to you about today.
So because of changing customer and consumer expectations, that database is required to be always on, so no downtime. It needs to be distributed across multiple geographies and maybe even multiple cloud service providers. And it also needs to provide data in real time. And that's why database architecture matters when it comes to modernizing data management. The reality is that customer expectations have changed dramatically. We need databases that are always available and can quickly scale based on demand, no matter how big that demand is. We also need databases with no single points of failure, with data replication, fault tolerance, and things like global data distribution. And of course with cloud come a lot more options when it comes to deploying your application. So it's also important to have the flexibility to deploy your applications, and the database they are running on, where you need them, whether that's on-premises or of course across multiple cloud service providers. Think about hybrid cloud models, for example. Finally, how you run your database can have a big impact on your budget. So profitability becomes a big factor, especially when we're talking about the cloud.

So these design principles are why a new generation of databases has emerged, whether you refer to them as NoSQL, distributed, non-relational, or by some other term. They are all built with modern businesses and modern applications in mind. Apache Cassandra was designed to address all of these architectural considerations. In case you're not familiar with it, Cassandra is an open source project. It was built by Facebook to power Facebook's inbox search feature, and it's therefore architected for linear scalability, for fault tolerance, and to run on commodity hardware or in a cloud infrastructure. DataStax, the company I'm here representing today, is the number one contributor to Cassandra. We've actually contributed more than 70% of the code commits to date.
All of our products are developed and updated from the open source Cassandra project. And what that really means is that we're able to provide our users with the number one database for scale, uptime, and performance. Really building that future database.

Back to the theme of this webinar, designing a future-state data architecture and where to begin: I think that Cassandra is a great place to start. It enables you to build it right, build it once, and then scale as your business scales. And that's because it's masterless, or you may think of it as peer-to-peer. It's really a ring architecture that ensures continuous availability across multiple zones and regions, with latency that we actually measure in milliseconds. So you no longer have to watch that spinning dial as you wait for things to load.

Now because there's no master-slave relationship in Cassandra, there's no single point of failure. So to ensure that data is available even if a node fails or is unreachable, Cassandra data is automatically replicated across multiple nodes, and even across multiple cloud service providers, to ensure constant data access. And again, that's no single point of failure. So when we're talking about high availability, we're talking about withstanding the failure of an entire data center. That's really critical in today's modern business environment. And this is why more than 60% of DataStax customers are deployed in the public cloud. This is actually a data point from early this year; I would expect it to be higher now. And this masterless architecture is also what allows the database to scale linearly without compromising performance. So whether you're talking about five nodes or a thousand nodes, it all scales linearly with predictable performance.

DataStax also goes beyond core Cassandra to provide support for multi-model workloads. And all of this is under one unified security model. So for example, we have a native graph database.
Those capabilities really enable you to identify and analyze the hidden relationships between connected data. The Spark analytics capabilities provide real-time analytics; I'm sure many of you are familiar with Spark. And then we also provide enterprise search functionality as well as in-memory performance. You can dig into more details of our data management capabilities on our website, of course.

Now let's talk about what this means with a real-world example. Let's talk about Macy's. Macy's needed a flawless data management platform to power its omnichannel catalog, especially during the busy holiday season. So you can imagine that right now there's a lot of stress on their systems and their data management platform. What this meant for Macy's is that in order to stay competitive, they really needed to provide a positive and engaging customer experience across all their channels in order to attract and retain customers both online and in the store. So to accomplish this, they adopted a multi-cloud strategy. They did that with IBM initially and leveraged DataStax for the omnichannel catalog service. The result of all that was that the API for Macy's omnichannel catalog now scales up to millions of universal products and a million requests per second with a millisecond response time. That's a lot of millions in there. You get it, right? It's a big deal. We were also able to support Macy's in providing a seamless online and mobile customer experience, both on-premises and across their multi-cloud infrastructure. So at this point we have five years with Macy's with zero downtime. That's a big deal for a big retailer like this, especially during the holiday season. Recently they added GCP, or Google Cloud, to their multi-cloud infrastructure. They started first with test/dev and then very quickly moved to production, because they were already running on DataStax from a data management perspective. This was incredibly easy for them.
We were actually in the room with their lead architect and drew on the board how they simply have to add the data center and change the replication factor in the schema, and then the DataStax database automatically begins replicating. This was something that the architect never thought was possible. There were literally jaws dropped in the room, which was really cool to see.

So that's the end of my presentation. If you're interested in learning more about Cassandra or DataStax, you can do so for free on DataStax Academy. We have tons of educational content available. You just have to sign up. There's no obligation to pay or anything like that; it's completely free. And the URL there, in case you can't use the QR code, is academy.datastax.com. Thank you all for your time and attention. With that, I'm going to hand you back to Shannon.

Louise, thank you so much. This was a great presentation. We've got questions coming in for you. So if you have additional questions for Louise, she will be joining us in the Q&A section at the end of the presentation today. But let me turn it over now to Donna Burbank. She is a recognized industry expert in information management. She has over 20 years of experience helping organizations enrich their business opportunities through data and information. She is currently the Managing Director of Global Data Strategy Limited, where she assists organizations around the globe in driving value from their data. And with that, I will turn it over to our series speaker, Donna Burbank, to get us started. Hello and welcome.

Hello there. Always a pleasure to join you guys. This should be a good session, wrapping it up for the year. For those of you who have joined in the past, thank you. And the question that always comes up in the Q&A is, are these recorded? And yes they are.
So if this is your first time joining us, or you missed any of the other ones throughout the year, these are all available at www.dataversity.net, in perpetuity, I believe. So do try to catch some of the other ones you might have missed. Also, the theme of this webinar is sort of planning ahead for next year. So on that note, we do have a full lineup for next year. We are continuing this series. So just to plan your calendars ahead, here are some of the topics we'll be covering, including things like cloud-based data warehousing, which fits into what Louise was just talking about. So do try to catch us again next year if you're able.

So, delving right into the topic: what we're going to cover today is actually based on a survey report I did with DATAVERSITY earlier this year on trends in data management. Hopefully you'll find this interesting, not only for the discussion of some of the trends; it's always nice to have actual data and metrics behind this. So I think you'll find it interesting. I certainly did. If you are interested in seeing the actual survey itself, it's in two places. It is out on the DATAVERSITY website, of course, and we also have it on the Global Data Strategy website under our white paper section. So in either of those places you can get a copy of this and see the actual report yourself.

So what we were trying to understand in this report, and what we'll also cover in this webinar, is firstly how we even define data management. Right? We're an industry that works on definitions, and we love to argue about what definitions of things mean. So we're sort of the cobbler's children who have no shoes: we're often bad at defining these terms ourselves. But I'd like to start with what we even mean by data management in today's new environment. What are the hot technologies to adopt, and what's a passing fad or a trend? It can be very confusing in the market.
There are so many options out there. As Louise mentioned, in the 70s maybe it was easy: you had mainframe or you had mainframe, right? But now it's just amazing. It's a great time to be in data management, but it can be overwhelming as well. So it's not only about tools and technologies, which tend to be faddy and trendy. How can we actually start to build the actual data architecture to support my business goals? Why are we doing this, and how do we make this a bit sticky, so it isn't just the tool of the hour but an architecture that can grow with my business? So hopefully you'll find this interesting as we go through some of the findings in the report.

So again, if we want to start with a definition of data management, what better place to go than the DAMA DMBOK, or the Data Management Body of Knowledge? If you look at their definition, I like it in that it covers a lot of the areas we're thinking about. It's not only how you deliver data management and what kind of platforms you use, and how you control and protect it, but also the idea of how we enhance it: how do we actually use data and information assets to help our business thrive? We're trying to do this for a reason. So that's sort of the textbook, or the DMBOK, definition of data management.

What I also thought was interesting in the survey, and hopefully we'll be doing these yearly and you'll have a chance to chime in yourself on future reports, is that we take a look at the comments as well, and there's always some great input from survey respondents. So I took some of the answers that people typed in, and I thought they were very interesting. A lot of folks touched on the idea that yes, data management by title is data and technology, but really it's the people and the process and the technology as well.
The second one there talks about organizational capability: yes, it's supported by tools, but it's also processes, standards, and importantly people, and we'll talk more about that in this session. The last one I like because it touches on what I mentioned earlier, which is that the value of data management is making your data effective so that you can actually support your business activities. I think data is fun. I think a lot of people on the call think it's fun. But we're not doing this for fun. We're doing this for a reason, to actually help the business and the organization.

That leads us to the framework that we use at Global Data Strategy, and as our name implies, we do data strategies globally. So we do this for a living, literally, and what we always start with is just what I mentioned: how does data support your business strategy? If you don't get that right, it's not worth doing anything else. There are a lot of things we could do, and we all have other things that we'd like to do in our lives as well. So let's focus on the highest-value activities and really understand how you can use data to enhance your business. So many companies I'm working with now are actually using the tagline that "we're a data-driven business." Can we be the next Uber? Can we be the next Facebook? Can we be the next company that we haven't thought of yet that can leverage data in a unique way? And if you're the type of data person that wears a lot of hats, or you have a business slant and a penchant for data, this is a great time to be in data, because that gear up at the top labeled alignment really shows that it's bidirectional. It isn't always just the business saying, "Oh, data people, go do this." Often it's data people coming to the business saying, "Hey, we could do this," just from looking at the data. So I think that's some of the great opportunity that's out there.
When we talk about things like cloud, from Louise's conversation, the, and I'm going to try to say this word again, democratization of data really gives a lot of opportunity to anybody out there who can access a massive amount of data and a lot of the great technology out there to do some great things with it. So we look from the top down at the business, but then we also need to look bottom up. If you look down at that level 5: what are we using in the organization, and maybe more excitingly, what could we use? And we'll talk about that in terms of the platform opportunity that is out there. Are you on-premises or in the cloud? Are we using all mainframe? And we can scoff at mainframe, but there's a whole lot out there that's still working pretty well. Someone did a good job way back in the day. That said, I don't see too many new implementations starting out fresh on mainframes. So what are the different opportunities? Does what we have as an organization align with our business goals, and how do we get there?

And anything is possible, right? Anyone can run a marathon. You may never have run before, but you can do the training to get there. So we've worked with organizations like this. We had one company that wanted to be a completely digital, completely online company, and they literally, and I didn't think these companies existed, were using paper processes. They sort of had the invoicing lady who would carry the paper over to another desk, right? That didn't mean they couldn't get there, and they did. But you have to take that realistic look at what we are using for technology and what we can do with it.

And then we sort of move up the stack to how we get that inventory of our data sources. How do we integrate them? Because there will be, I guarantee you, disparate data sources across your organization, from ERP systems to relational databases to cloud to mainframe to NoSQL, right?
So how do we really get that integration in an effective way? And then how do we have the metadata management to really make that integration effective? Do we know what the data means? Do we know what the lineage of that data is, and so on? And then moving up the stack again, how do we really make sense of that with a holistic architecture, whether it's a data model or a data architecture or mastering the information? Is there a warehouse and/or an analytics hub, et cetera? What's the quality of the data? So data is complex, and it isn't all about the platform and it's not all about the business needs. It's how you put all that together, in sort of a magical way, with the architectural structure to have it make sense.

And then, as folks wrote in the definitions of data management, "manage" is a verb, and there have to be people actively doing that verb. So that's where data governance comes in. It's the people and the process and the policy and, probably most importantly, the culture. As I mentioned, we do a lot of these at our company, Global Data Strategy, and we do some maturity assessments. We are often lucky enough to grow with our customers and be there several years in a row, and we like to do that maturity assessment year after year. Often the fastest thing to fix is the technology. We can get a new platform in place. We can get new tools. But getting the people to come along with that, and having a data-driven culture where everyone understands what that means, is often where the lag is. So governance is not something you can skip. You'll see that that line is strategically placed between business strategy and the implementation below it. That's really where the glue melds things together. We have a company that's doing business processing and we have technology, and it's the people and the processes that really make that work. You can't skip giving it the time and space. It's really critical to everything we'll be talking about.
So, moving along into some of the survey results and our findings, what I thought was interesting to look at is the many pieces of that framework of data management I mentioned. Everyone defines data management slightly differently, but what are you using today, and then what's your vision for the future? So it wasn't surprising to me, and I'd be curious about your thoughts in the chat or in the questions, to see a huge focus on business intelligence, reporting, and data warehousing. I see those as different: one is the reporting layer and one is the database layer behind it. So many companies are pushing to be data driven, and a lot of that is data-driven decision making. Do I have the right analytics, either prescriptive or descriptive or predictive, depending on what level you're at? They're really trying to use data and analytics to understand the business. So that particularly wasn't a surprise to me. It was great to see the high focus on data security, because I think that goes with privacy and governance as well.

What was also interesting, looking ahead, is a huge jump, and this is something that we didn't expect and hadn't seen in previous surveys: the large shift towards semantic web technologies, or, if you like, your graph database model. I think Louise touched on that a little bit in the introduction. We're trying to discover hidden patterns in data or make connections between data, so that makes a lot of sense. Similarly, there's the high data virtualization count there. As people have disparate sources of information, the idea of leaving data in place and creating a virtualization layer on top can be very appealing. Again, these are all tools in the toolbox.
I have a lot of rants and pet peeves, and some of you have been lucky enough to hear them. One of them is when a vendor comes in and says, "We have data virtualization and therefore data warehousing is gone," or "We have data warehousing now and that means any operational reporting is gone," and so on. These are all valid technologies in their place, they are all tools in your toolkit, and we'll get more into that. And then you'll see here that the third, fourth, and fifth aren't a surprise either. A lot of companies are looking at analytics: that idea of data science, AI, machine learning, big data, and self-service analytics. And then of course, which is nice to hear, when people are looking at analytics and big data they're also looking at metadata management and data governance, because that really completes the picture. You could have great reports, but if you don't know what the data means or who created that data, they're not going to be effective, or they may even be risky to your business. So it was kind of refreshing to see that as well.
So what I also found interesting, and it ties into the beginning, is not only what people are doing but the why. What are your business drivers? So again, analytics and reporting top the list. In a way that can be frustrating to some of us in the business: when people think data, they think reports, they think analytics. That is one very valid use case, but it's not the only one, and you'll see some others there as well that have probably been true since the dawn of time when it comes to data. Can we use data to save cost and be more efficient? Yes. Can we use data and governance to reduce risk? Yes. Can we improve customer satisfaction? A lot of these are common, and it's great to see that they're continuing needs.

I think that idea of digital transformation is another one I'm seeing in our practice. It goes along with some of that customer satisfaction. I mean, what does the customer need? A lot of this is digital. How do we translate that customer journey from on-prem or brick and mortar to the digital workspace? So what's refreshing to see is that people do see data linked to it. In the early days of digital transformation, I know I saw with some of our customers that it was, "Oh, digital means web, it means the front end." I think as people have matured, they've realized that digital transformation really is the data. You can't do any of it without the data; that's really the foundation of your digital transformation. So I'd be curious about your thoughts. I don't think those were surprising, but they were refreshing. It parallels a lot of what I'm seeing in the industry as well.

If we move ahead, here we were just putting statements out there: what do you think is accurate in this laundry list of things that we hear all the time in terms of data management? I think very refreshing was the answer to "Do you see data as an essential asset to your business?" That was the spike you see there at the bottom: almost everybody,
and again, maybe we have a biased audience, as this is a DATAVERSITY survey, so of course folks taking the survey tend to have data on the brain. But it was refreshing to see that the majority of organizations really do see data as an asset. I think where it breaks down, and I see this as well in my practice, is: yep, I know that data is an asset, I know that it's important, but what does that actually mean, and how do we implement it? When the rubber meets the road, how do I get that to work? That is where things are ranked lower, in some of the smaller bars. Do all stakeholders understand their part in data management? I see that a lot as well when we're trying to implement things like data governance. "Yep, data quality, we need it, go fix it." Well, everybody has a role in data quality. If you're putting the data in, or you're managing the business process that enters the data, that's not something that IT can fix or software can fix.

Part of that ties into communication, and communication is hard anywhere in the world, right? People are complicated. But especially when we're talking about high-tech stuff like data, it adds that layer of complexity: how do we take very technical terms and explain them in a way that everyone across the organization understands? And I mentioned quality. That's continually a challenge because it's something that always has to be worked on. Data is part and parcel of the business, and the business continues to evolve. So data quality isn't something you do once, put a tool in, and fix. It's something that needs to be managed on a daily basis. And part of that, in that last bullet, is getting those metrics in place. What does good look like? What are our targets? Does everybody in the organization know what our data quality targets are? We might know our financial KPI targets. Do we know our data quality KPI targets? Are we all marching to that same beat, business people and/or
IT people? We should also be trying to reach a certain threshold for data quality for key data assets.

So that kind of leads on. I've been talking a lot about, or at least alluding to, this idea that data isn't just an IT thing or a data analytics thing; it's a business thing. So we asked: when we're talking about data management, who is driving that? Who would you say is the leader when we're talking about data management? Some of those should not be a surprise. When we have folks like the data architect leading data management, I would hope so; that is not necessarily surprising. The same goes for data analytics, or the chief information officer, or the chief data officer. I think the high spike in business stakeholders was interesting. Now, if you notice up top, looking at the data carefully, this was a "select all that apply" question. It would make me nervous if, say, a business stakeholder were the only person running the data management initiative, because it is a technical thing. But by the same token, I would be nervous if only the IT manager were running it, right? What was interesting when we looked at "Other," the other spike at the bottom, was that data governance lead was probably the most common write-in. That leads me to believe that normally, when one puts in a data governance council or steering committee or a team or group, all of those stakeholders that are listed there are involved. So since this was multiple choice, I'm hoping, and I have a strong inclination given that I've worked with a lot of organizations that do this sort of thing, that it's probably a team effort. That is a great trend to see. None of this should be done in a vacuum. When you're building a data management organization, it should involve the business stakeholders and the analytics team and the CIO, et cetera. So I see that as a positive trend. And governance came up throughout the survey, sprinkled in even when we didn't ask, which is a great tendency. Those of us in the business for a long time are
probably refreshed to see so many people finally understanding that governance is the glue that holds all of this tech and data management together.

So we can't forget: I've talked a lot about people and governance and process and all of that, but at the end of the day we are putting data on a platform, so technology is critical, and it's what keeps a lot of us in the business, because it's kind of fun. So when we look at that, the part of this that keeps me up at night and makes me twitch, the point at the bottom, is that spreadsheets keep coming up year after year as one of the leading data sources or platforms. Now, one could read this in several ways. If one is a positive, Pollyanna, glass-half-full sort, it could just mean that business people are looking at data, and what business people look at is spreadsheets. We did say "source or platform," but it could be an export, right? Maybe I take it from the master data or the warehouse, put it in the spreadsheet, and do stuff with it. That makes me less nervous than "I am managing my master data in a spreadsheet," or my quote-unquote warehouse is a spreadsheet. And I have seen all of that. I think three or four companies we worked with this year had the Mary spreadsheet or the Joe spreadsheet or the Michael spreadsheet that really was the location master or the employee master list: seriously important data to the organization that was stored in a spreadsheet. So I hope and pray, though I'm not encouraged that it will happen anytime soon, that that spreadsheet line will keep getting smaller as we do this year over year. It should be a consuming mechanism or an analytics tool. If you're doing some financials, the spreadsheet is a great place to do that, but it's really a poor choice for a master data hub.

The other one that continues to spike is the relational database. That has been true for the past few years we've done this, and you'll see that, at least today, the majority are still on premises, but
you'll see that the other larger spike there is cloud-based relational databases. And if you'll permit me a bit of a rant (I guess it's one of the rants I already did): when folks come in with a new technology, and you'll see the others there, whether it's graph or IoT or semantic or XML or JSON, the message is sort of that relational databases are going away. I don't see that happening anytime soon. I think the data isn't supporting that and the technology isn't supporting that. What relational databases do, they do very well. In fact, a lot of the customers I work with are using relational databases, but perhaps more like a spreadsheet, and so I think there's some untapped potential: folks that don't have keys, or don't have third normal form, or some of the things that really enhance data quality. As folks look to automate, and they automate things like governance, relational databases were kind of built around that premise: how do we keep things consistent across systems? So I don't see them going away yet. I see the ecosystem changing, which leads to the next slide, which is the future, right? You'll see there are still a few people who expect to use spreadsheets in the future; it went down a little bit, but they're still there. You'll see relational still up top, still a clear winner, but you'll see the trend of less on premises, or rather equally on premises with cloud. You'll see cloud growing. Actually, I correct myself: you don't see on premises decreasing, you see it being equal with cloud. So it doesn't necessarily mean one is better than the other, and we'll talk about that in a bit; they both have their place. But you'll still see that relational databases are sort of the workhorse of the organization. What you also see is a more even distribution of the technologies, which I find refreshing. If we go back to the previous slide of what people are using today, there are spikes, the relational databases and your
spreadsheets, and then people are kind of dipping their toe in the water with these other technologies, which is fine, because these are new, and it's great to dip your toe before you jump into the deep end, right? But what makes me heartened by this is you don't necessarily see a spike of everybody jumping on the Hadoop platform or the graph database platform, because I see these as fit-for-purpose solutions, right? There is not a one-size-fits-all. The great thing is that now in the database world we have a Swiss Army knife, right? You don't just have to use your old steak knife or butter knife; you have a lot of tools. I read this as people realizing that and using the tools in the toolkit as they see fit, which is probably a fairly even distribution like this, right? Hadoop has a great use case, graph has a great use case, relational databases have a great use case. Use them as they were designed and I think people will see much more success. So I see this as heartening, and it's also fun, because that's why we're in the business. I mean, the number of tools that are out there, and the number that you can literally spin up in the cloud and just play with from your living room, is amazing. I was telling this story to a friend yesterday: I have a friend who works for NCAR in Boulder, Colorado, which does a lot of atmospheric research, and his dad did too, and his dad said, "Son, the amount of power you have in the cloud and on your laptop we would have killed for, and the amount of open data that's available from governments and research agencies, that you can literally spin up in your living room, is stuff that we would have died for. We had a big mainframe in Wyoming, with a whole building, that kind of did this analysis for us." So it is a great time to be in data management, and you have a lot of different tools in the toolkit. So I thought this was interesting. There are probably technologies here that may be new to you, and it's a great time, and DataVersity is
a great platform to kind of learn about some of these new technologies, like semantic technologies or graph databases, et cetera, because there is a lot of cool and fun stuff out there. Cloud must come up, because that is a trend, so we asked two things: what are the pros and what are the cons? You'll see there that the highest pro people listed was better scalability, and I just spoke to that: you can literally spin up something on AWS or whatever in the cloud at a very low cost and get things done very quickly. Also, like Louisa mentioned with the Macy's example, a lot of organizations have seasonality in their demand. Maybe you need a lot of bandwidth around Christmas, and then in March nobody's thinking about you, so you don't want to have to buy a bunch of hardware just for that one month. So there's a lot of flexibility there. The con that folks mentioned was that idea of privacy and security, and it's heartening that people are thinking of that. I think people have to be realistic: not all cloud providers are the same, and clearly people are using them, so it's not all risk. I'm sure you guys know this and have heard this before, but sometimes a nice way to think of it is that the cloud isn't "the cloud," it's somebody else's machine. When you think of it that way, be careful what you put there, and make sure that your contracts are done accordingly and that they have the proper SLAs. I have had customers whose cloud provider goes down. It's bad when your own server goes down, but at least you have some control over that; when somebody else's server goes down, it's a scary feeling. So anyway, those are the as-reported pros and cons that people saw. What I found interesting, when you put them side by side, is that some of the pros were also a con. What is the reason for moving to the cloud? Lower costs. Why don't you move to the cloud? Cost. Okay, they're both right. Really,
you need to think of your use case, and I'm actually glad this came up, because it's sort of like asking: is it more or less expensive to rent or buy a home? There is no one answer. Everybody has a different lifestyle or financial model, and there are pros and cons to each. But think carefully: it is a different cost arrangement, whether it's CapEx or OpEx, and it's one big decision. It is a different world. I had one client, and I mentioned that it's so easy to spin up test databases. They were used to on premises; I think it was SQL Server that they were using in the past. They moved to the cloud, and they would spin up these servers and then forget to shut them off, and you're still paying for that. It's not like you buy the hardware and it's done. So part of that was just educating their team. Their costs ended up being far higher in the cloud, and it didn't mean the cloud was bad; it was how they used the cloud. So again, think about that carefully. There is not one answer. Make sure it matches your use case. Back to "is it cheaper to rent or buy a home": think of that for you; there's no one answer. Similarly with better or lower performance: there are different use cases, and it's not all or nothing. So really do some tests on your data, think of the variability on the different platforms, of your usage throughout the year, et cetera. So I thought that was kind of interesting: one person's hell is another person's heaven, right? Really think it through. None of this is a one-size-fits-all solution. And I would be remiss if I only talked about technology or only talked about governance; after all, this is an architecture strategies webinar. What can be overwhelming is the fact that these tools and technologies change so rapidly, and what I find heartening is that there are architectural principles that go across all of these. So, having a data model of your assets, and that should not change as
rapidly, right? I still have customers and products and accounts and invoices, and I should be able to have an architecture, or even a conceptual-level map, of the data assets of my organization as I move across these different platforms, from on-prem to the cloud. And I want to see how that data moves: do I have that written down somewhere in a data flow diagram or system architecture diagrams? When we go in and do some work together, I see so few companies where there is an overarching view of that. A lot of folks have detailed views of individual systems, but not that stepping back and really looking at the big picture. So I think, as one looks at all of the different choices available now and in the future, stepping back and asking "what does my business want to do, what is the data I need, how is my current-state system architecture and data flow working, and how does that affect my business process?" can be very eye-opening, and it can really help you focus among all the choices: instead of a more technical way of looking at it, what am I trying to do and what do I have today? So this was, again, from the survey: what the survey respondents are actually using. Highest was a logical data model, which I found interesting, and I guess the positive of that is that it's a business view. One could argue that logical is kind of based on relational, and I guess it sort of is, but at the end of the day it should be a platform-agnostic business view, at a fairly detailed level, of your organization. Similarly, I saw the conceptual model come in second. That absolutely should be platform agnostic, in that it should say: these are the data assets we have in this organization. Again, we have products and patients and students and classrooms, et cetera. That should be your guiding light, or your guide, as you work across these different platforms, for what data goes where, even having an overlay on these platforms. So I found that interesting, and just sort
of a reminder that no matter what technology you're using, please do have that system architecture view, the bird's-eye view, and then the data asset view as well. So hopefully those findings were interesting. Again, if you want the full report, it's 40 to 50 pages, so you can definitely keep yourself up at night if you can't sleep. There are some great findings there. But I think more interesting is the "so what": how do I put this together as I head into 2020? I know a lot of you are, as we are, in planning mode: what do we do next year, what are our priorities? So again, we have all of this technological change and all these options. I was complaining to someone the other day: I travel a lot, as you can imagine, as a consultant, and one of the things that stressed me out, and maybe it wasn't rational, was that at one of the car rental companies you can now pick your own car. I remember one day I'd had quite a stressful day and I just thought, really, I have to choose one more thing? Tell me what car I get. And sometimes I think we can all feel that way. You go to the grocery store and there are 700 versions of shampoo. Anyway, sometimes we can feel that way with technology. In some ways it was easier when the choice was mainframe or mainframe. With options comes responsibility, so sometimes it's nice to have some basic steps or templates or mantras to keep everything in order as we're looking at all of these disparate technologies that go together. So this is kind of a thought for how we put together a data management program, and you'll see there are a lot of people things here. Technology is clearly important, but none of that works well unless you have buy-in and support and a team behind you. None of these need necessarily be followed in order; it isn't a one-two-three. But you'll see up in the upper left, I would say, one of the more important things is getting that senior executive support, having a data champion from the business, not necessarily IT. I would hope that the
CIO or the chief data officer was supportive of data, but what about your chief marketing officer or your CEO or your HR, et cetera? Make sure everybody understands that data is an asset and what they need to do, and then align what you need to do with your company's vision, motivations, and drivers. Are you focusing all of this great data on the right things? Part of what you can do there is talk to a lot of different folks. We all get into our silos, and we all could benefit from getting out of those silos. I took a management class once, and one of the workshops we did was just "find somebody in the room who looks completely different from you in every way." I ended up with the guy in the law department. Really, we had nothing in common, but we became fast friends and were able to help each other. He knew nothing about tech; I knew very little about law. If he needed tech advice, he could come to me, and vice versa: if I need legal advice, I can go to him. And I think in your organization, think of that. We're trying to launch a marketing application: have I talked with marketing? Do I understand what they need? Can marketing help me communicate about our data program? So again, a lot of us get into silos, and if you're trying to do a business-driven data program, go outside your comfort zone, either across functions or maybe up and down. Is it the data entry clerk? Have you ever talked to that person who actually puts the data in, for the data quality? Have you gone up? Have you talked to the C-level team and asked what they understand and what their goals are around data? That often can help you get that long-term vision and build the business case for why we're doing this. Look at the data that's most critical. We can do a lot of different things with data, but none of us has all the time in the world, so how do we focus on the business drivers and map those to your capabilities? And I think also assess your maturity realistically. In some cases you can do a massive-scale jump from
you know, where you are to some of these newer technologies; in some cases you may want to take a more measured approach and get your ducks in a row before you go too fast. A lot of folks, for example, want to jump into artificial intelligence. That has increased our business a lot, not in AI but in areas like data quality and governance, because you can't do AI on bad data, right? So before you take on some of these new technologies, make sure you're ready. And then create the right organization, whether it's governance or steering committees or data teams, so you can really deliver those quick wins and show value continually. Especially with some of these cloud providers, you can do things a lot quicker, but you need the glossary behind it and the governance behind it and the architecture behind it to make sure that that quick win will still be winning two years from now, right? And then communicate, communicate, communicate, and make sure everybody understands that vision. That's often one that people forget: we're so busy building stuff that we forget to tell everyone about it. I had a stint in marketing myself, and I think the mantra was that after maybe six times hearing it, people might remember it once. So you've done this great data quality cleanup or built this new platform in the cloud, but does everyone know about it? And did they hear about it six times? Because they have other things they need to do. So make sure you use webinars, lunch-and-learns, email, et cetera, to really get that word out. And as you put together your roadmap, it can be overwhelming, because there are so many things you need to do. You know you need to clean up data quality, you know you probably need better governance, but if you focus just on that, folks are going to say, "Well, where's all the analytics? Where's the cool stuff I need? Where's AI?" So I would say, when you're looking at this, look at that business value. I know
this is obvious, but sometimes just writing out the obvious can be helpful. Where are we trying to get? We're trying to get an integrated customer view by the end of the first quarter. Who cares about that? Those are the people I need to market to, communicate to, and get involved in the design: marketing, sales, customer support. If I forgot about support, of course, that would be interesting, right? Get all of your stakeholders together, and then do a bit of a mix here. Hopefully, in that list on the left, you're doing some foundational things like lineage and glossary and data design, but maybe get some open data in there, maybe do some IoT integration, maybe try a graph database. If you focus on the use case, focus on the foundation, and also focus on some of the new shiny things, it's a nice mix of getting to innovation faster but not building that innovation on a faulty foundation that's going to crack later. So think about that and mix it up a bit. We can all get too excited either way, either over-governing and having only architecture, or going after shiny things and never building the architecture beneath them. So try to mix it up. Again, as you put together your roadmap and strategy for next year, the standard who, what, when, where, why can be really helpful. "Why" can sometimes be the most valuable one, right? Why are we doing this anyway? If we were to pick one thing that management or the entire company could get behind, what is it? Are we trying to get better customer retention this year? Are we trying to lower costs this year? Often, especially if you're publicly traded, just go to your company's annual report: what is management saying to the Street, and are your goals well aligned with those? Are we on offense, trying to drive growth and revenue, or on defense, trying to reduce risk? Is the whole point data protection? Is it GDPR we're worried about? And
then how do we keep our eyes on that, right? Then the who; I mentioned that already. Who could help champion it? Who's going to do the work, both from the business and from IT? And who's going to own this and champion it going forward? Then the how: not only how technically, which I think we talked about (what platforms do we pick, is it on-premises or in the cloud?), but also how do you organize the teams around things like governance to really help you roll that out? And then the what. Think about that as well; it's sort of aligned with the why. What are we trying to do? We're trying to have better customer retention; well, let's maybe focus on customer data and maybe the location of those customers, et cetera. But we can't do everything, so pick something small. You'll do a high-level conceptual or logical enterprise model so you can see the entire scope, and then, if you think of those color-by-numbers, right, think of your conceptual model as "this is my empty slate; what are we going to color in?" and work in phases. We're going to start with product and customer and link those together, or products and location, and start filling that in over time, but aligned with your business drivers. And then think about what's the best platform to store that in. Maybe my customer master data should be in a relational database. It should not be in a spreadsheet, we all know that, and a Hadoop platform is probably not a great place for it either. But maybe I'm trying to get some IoT streaming data or call logs, et cetera. So think of what platforms align with that. And then the when: make sure that when we're going to roll it out is realistic, but broken up into manageable chunks, so that you're showing some quick wins along the way. We cleaned up customer data: great, look, now we have addresses and we can send out mailing campaigns. Great, we integrated everything into a warehouse. Now we have MDM. Just keep communicating and develop a lot of
small things quickly rather than waiting for the big bang at the end. So in summary, as you're building your data architecture for next year, make sure that you understand this is aligned with business insights. What seems to be hot in the market today is the idea of reporting and analytics. To get there, you not only need a diverse technology landscape but also collaboration across the teams. And don't forget, as you are shooting for some of these great new technologies, that you still need the metadata and the data models and the architecture, as well as the governance and the people behind it. So I will open up for questions in a moment. Just quickly: this is my company, Global Data Strategy. We do this for a living, if you need help. Remember that next year we have a full lineup, and DataVersity will be getting the word out soon on how you register if you want to join us again next year. And again, if you want the paper, there are two places to download it from: either Global Data Strategy or DataVersity. So without further ado, Shannon, I will open up for questions. Donna, thank you so much for another fantastic presentation, great as always. If you have questions for Donna, feel free to submit them in the bottom right-hand corner of the screen. And to answer the most commonly asked questions, just a reminder: I will send a follow-up email by end of day with links to the slides and links to the recording of the session to all registrants. So the first question in here is actually for Louisa: is Cassandra a DBaaS, database as a service, or do you need to host your own instance? So, you do need to host your own instance of that. At DataStax we do have the, let's call it, self-managed product; it's a product called DataStax Enterprise that you would manage yourself. Deploying in the cloud, you can also deploy through most of the big marketplaces. We also have a product in beta called DataStax Apollo, which is our new database-as-a-service offering, so
we'll hopefully be going to GA with that next year. You can try it for free at apollo.datastax.com. I love it. Donna, starting with you: please describe the details of the survey. When was it conducted, who were the respondents, and how many respondents were there? That is certainly written within the paper itself as well. Oh, it certainly is, and I don't have that offhand. I think it was, on the whole, 250 or 300 folks across most continents (not Antarctica): across Europe, Asia, Africa, and North America. And it was, I think, released in September. Correct me if I'm wrong, Shannon: we launched it, I think, in May or June and then did the analysis over the summer, correct? Yeah, the survey went out in May, and then we launched the paper in September. Correct. And you're right, we had about that many survey takers across the globe. So it might be helpful for the person who gets the paper: we also break that down by role in the organization and by industry, so it's a pretty broad mix. It isn't just financial services or just consulting. We were pleased that it was a fairly representative survey of a lot of different organizations. Yeah, agreed. And if you have questions, again, feel free to submit them in the bottom right-hand corner. Everyone's pretty quiet; there are lots of chats going on but not a lot of questions. Everyone's enjoying holiday food already, I think. All right. I know, I actually left time for questions this time, which is rare for me. I know, you usually go over. You know, there have been a lot of comments in here too about data quality, and integrating data quality and building it in, so that was certainly a hot point within the presentation. Any additional comments on that from either you or from Louisa too? I'd love to hear from you on your take on that
in general. I will let Louisa go first and then I will certainly chime in. Louisa, did you want to comment on that? I think that's something that definitely comes up with our customers all the time, and that's one of the reasons we decided to base our database on Cassandra, because it can support data from so many different sources, so you're not trying to consolidate it yourself; the database does a lot of that for you. It's also why we've gone ahead and integrated technologies like the graph database. I think, Donna, you mentioned that is one of the big technologies people are considering over the next two years or so, and we're certainly seeing a lot of that as well, as people are getting more and more insights by being able to correlate a lot of that previously unrelatable data. So there's a tremendous amount of potential from that standpoint, but it's only as good as the quality of your data, right? It's really important to make sure that the technology you're running on is extremely reliable, that you're replicating the data, and that you know it's accurate as a result of that. So there are definitely a lot of different things that come into data quality, for sure. Yeah. No, and I'm sorry, I was just asking if you had anything to add; I think I've got a little bit of lag on my end. Oh yeah, I think you were; sorry about that. But yeah, I mean, I think that also shows, a little kudos to the DataVersity crowd, that we're kind of the folks that get this even more than others. But I think across the board, both the survey respondents and obviously the chat, that's becoming more and more obvious, and to me it's heartening. Maybe because I've come from a tool vendor background, it's a little trigger for me when everyone's only talking about the technology, right? That's only a piece of it. But I think it's partly because so many more business people are involved now, and to them that's obvious: I need to have my report right. I don't care whether it's on Hadoop versus, you know, SQL Server; I just
want the data right. And so that's sort of an extra thing for them; it's fun for us. So I just think that's part of the maturity of the industry now: I know, as corny as it is, data is an asset, and folks are realizing that it has to be right. And as we get more of those stakeholders involved, like we saw in that chart earlier, I think it takes a village to raise data, and I think people are coming to that. So I'm seeing that more and more people are starting with things like governance and data quality before they implement a thing, which I think is great, because that's what stays. Your technology is going to come and go, and it's great, but it's the people and the process and the data itself that stay across the systems. So I thought that was heartening. I wanted to move on to the next question, if you have it. Oh, not a question, but I saw some of the comments that people were surprised by the number of logical and conceptual data models. It's near and dear to my heart, and I think some of the comments are like, "Who are these people? Who's actually doing that?" I see that all the time, and I may be biased because we use them in our practice, especially conceptual, because I almost consider that the whiteboard version of your data model. I know they can be hard and they can take a long time, but it can also be fairly easy to sketch one up and do some very critical analysis very quickly. And I attribute that to more and more business people being involved. It seems to me, and I know I'm stereotyping, that often it's the tech folks who don't like to do them: "I just want to build the stuff, and okay, I'll begrudgingly reverse engineer a physical model, but just let me code." But business people love them, because it's a way to demystify what's in the database and really start hashing out the business rules. So I see that paralleled with that rise in business users becoming more
involved, because a great way to get them involved is using a conceptual and logical model. Pedestal... I have removed myself from the pedestal; I'll stop talking about that. Just a minute left, but I know you have an elevator pitch for this, Donna, and Louisa, feel free to jump in. So: is there a business case for data management? We get this question a lot, so I know this person is not the only one asking about it. A business case for data management: any good resources? Yeah, I think the business case for data management really depends on your company. So of course there's a business case, and there are papers that have hints on it. I talk about it in a lot of the previous webinars; you'll see we talk about data strategy and a lot of your business needs. But that's something I think you have to... of course, if you don't do it right, you have to do it again. But I think aligning that to your company's goals is the best way to answer that. It's not necessarily reading a book or even going to the DMBOK or DAMA or anything like that; it's tying it back to your company. That's what I would say. Louisa, anything you want to add? I think that's pretty much accurate from our standpoint as well. It's very much based on the company; it's based on the type of workloads that you're dealing with. And again, I think that goes back to whiteboarding; we use whiteboarding for these kinds of discussions as well, actually figuring out all the different parts of the business that the data strategy is going to impact. We find that's a good place to start. Oftentimes, as a vendor, we're approaching one part of the business when in fact it's a data strategy that can impact other parts of the organization. Well, Louisa, it's been such a pleasure to have you today. Thank you so much for joining us, and thanks to DataStax for sponsoring today and helping make these webinars happen. And Donna, thank you as always for a great presentation, and thanks to our loyal attendees for being so engaged in everything we do, and for all the great comments and
questions. That is all the time we have for today. Just a reminder again: I will send a follow-up email to all registrants by end of day Thursday with links to the slides and the recording. Thanks, everybody. Thank you all, and I hope you all have a great day. Thank you.