And welcome. My name is Shannon Kemp, and I'm the Chief Digital Manager of Dataversity. We'd like to thank you for joining the first installment of the monthly webinar series, Advanced Analytics with William McKnight. Today, William will be kicking off the brand new series by discussing trends in enterprise advanced analytics. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. If you'd like to chat with us or with each other, we certainly encourage you to do so. Just click the chat icon in the bottom middle of your screen for that feature. For questions, we will be collecting them via the Q&A in the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag #ADVAnalytics. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and additional information requested throughout the webinar. Now, let me introduce to you our speaker for this series, William McKnight. William is the President of McKnight Consulting Group. He takes corporate information and turns it into a bottom-line-producing asset. He's worked with major companies worldwide, 15 of the Global 2000 and many others. McKnight Consulting Group focuses on delivering business value and solving business problems, utilizing proven, streamlined approaches to information management. His teams have won several best practice competitions for their implementations. He has been helping companies adopt big data solutions. And with that, I'm so excited to be working with William. We've been talking about working together for quite a while now. So I'm excited to kick off this series with him. William, hello and welcome.
Hello, Shannon. And hello, everyone. Thank you, Shannon. And thank you to Dataversity for giving me this forum.
I've often wanted a recurring forum for webinars, something that I could string together a little continuity, shall we say. So some common themes and so on. And I finally have that, so I really appreciate that. And also, Shannon remind me to talk to you about the music. I don't know if I have that in the cloud, but I might want to update the music here. Okay. So for everybody else, thank you again for coming. And hopefully we make a productive hour out of this. This is really a foundation piece for the entire series that we're going to have this year. I'm really excited about the series. A lot of the things I'm going to be talking about here today, I'm going to be drilling in on the subsequent webinars in this series. And again, it's going to be the second Thursday of every month at this time. So mark the calendars for that. And without further ado, I will jump on into the content for today. First little bio slides. Shannon already gave my bio. You can find out a lot about me at McKnightCG.com or various other places. I do encourage you to get in touch. And this is what we do, strategy, training, and implementation of the things that I'm going to be talking about here today. I'm also privy to a lot of vendor analyst days types of things and briefings and so on. And so I stay up on both sides. I stay up on the analyst side. I stay up on the user side, the buying side of this. As a matter of fact, I've just been behind a couple of large scale surveys of people like I think yourselves, people who would be attracted to this kind of a webinar, people who have purchased data management solutions in the past year. And I've been able to, in the course of 50 questions, dig into their rationale, their beliefs, the process they go through, and so on. So from both sides of it, the user side and the vendor side, I'm going to bring some ideas here about trends in enterprise advanced analytics. 
Now, we're not just strictly going to be beholden to this year, 2019, although the idea for today did come from the fact that several of my clients are still developing their 2019 plans, as the case may be. You know, things got busy in December and so on. So this may be relevant to your 2019 plans, but whether you have them locked down or not, keep in mind they're never really locked down. So they must be fluid today. But we're not beholden to 2019. I'm going to be talking about some things that probably won't hit your radar hard until 2020 and beyond. At some point none of us know, but I'm going to look at 2020, maybe 2021 a little bit as well as we go forward here. And there are some foundational pieces that you really need to be putting in place today. But I also understand that I'm going to be talking about trends that you can't just chase trends. You cannot just chase any old trends and hope for the best. You have to pick your winners. And you have to really, in this marketplace, there's a lot going on and there's a lot more that the vendors are talking about that I'm not even going to be talking about here today because I don't believe it hits the plans for even the global 2000 yet. So I'm going to be talking about the things that I believe in, the things that I believe are going to stick based upon my vantage point and hopefully give you some ideas about trends. So why are they important? It's important because beyond the mountain is another mountain. There's another mountain out there. We can't even see it right now from where we are, but I do know we have to get over the one that we are climbing up right now and then we'll look out and then we'll see what's out there. But there will be another mountain and you cannot just jump mountains. You can't jump from this mountain to the mountain in the far range here with the picture. You have to jump to the next one. You have to go down this one actually and get up that next one and they build on one another. 
And foundational pieces are laid and you go forward in that way. I've heard people talk about skipping trends and catching the next wave and so on. Oh, aren't we lucky we didn't get into Hadoop or some such thing? Not really because whether your platform was absolutely correct for that unstructured analytical data or the data lake as the case may be for what you used Hadoop for, nonetheless, you were using Hadoop for that thing. And so that thing is still relevant. That thing has a business relevance and the platform we might talk about. But no, it's not great to be skipping trends, skipping mountains. And I'm going to try to help you. I want you to think about picking your winners. So every one of you that's in information management, myself included, we have to keep our radars open to messages that are out there and we have to have filters and we have to really think about what we're hearing and pull in to our consciousness, pull into our plans, start trying to implement them in our enterprises. Those things that we think are going to be keepers for building a foundational future. We are in the business of data. So everything we do is really important. And I have a whole presentation on this. I'm so passionate about it. I so believe in this that our information is exploding. It's becoming real-time. It's really what differentiates us from our competitors and our quality impacts everything that we do. And there's really not a strategic business objective that is not completely either nailed or at least supported through the use of information. And I like to talk about this concept of having information under management. And I'm trying to help my clients get more and more information under management. It's not that they don't have data running around or data in loose places or maybe working for one department over there or something like that. Of course they do. Everybody has tons of data, but it's under management. Is it in a leverageable platform? 
Is it in the very best place to succeed for enterprise purposes? That's what I'm talking about. Getting data into that right vessel, that right platform to succeed. And sometimes it's already in the right platform, but the surroundings, the accoutrements, the process surrounding that is not to a level of maturity that it needs to be at. Speaking of maturity, data maturity is highly correlated to business success. And we've done studies to prove this out. I encourage all of you to have a maturity model and to follow it and know where you are and know what things you need to do to get to that next rung of the ladder, if you will. That next level of maturity. And start working on that because, again, hopefully you believe my concept here that you can't be skipping rungs of the ladder. You can't be skipping mountains or levels of maturity. You've got to climb up one, two, three, four, five. And guess what? Today the things that make a level, any level, say level three, that might be a level two a year from now. It just keeps changing like that. And I know that you're not going to just get a budget to flat out and go out and, you know, increase data maturity. The key here, the leadership here that's required is to do that while at the same time continuing to deliver business wins and to do it in a scalable fashion. But most companies are not, you know, very high in the maturity modeling and our studies have proven that most of companies out there are still at a fairly low level of maturity. So all the things that you hear from the vendor community, a lot of that is kind of out there for a lot of enterprises. That doesn't make it right though. That doesn't make it right for you. You still need to be saying, I don't really care about what the other enterprises are doing. I care about what the possibilities are and how to mobilize the energy that we have in this company towards the very best things for us, both short-term and long-term. That's really what we're about. 
And there is a standard. And that standard is level three in my view. And I'm not, this isn't the maturity talk. I'm not going to expound on this tremendously here. I will have a maturity talk later in this series. But there is a standard that we all need to be at with data. And there may be some other maturity models out there for other things that you may or may not really need to be at some kind of standard for. It's just nice to know. But data is not nice to know. Data is the foundation of our business. So it's important to be at a minimum level here of maturity. And again, I'm going to let you for today anyway define that as you need to. But we need to be moving up. Maturity modeling helps because it gives you a sense of priority. You can't skip levels. I've said that a few times. Maturity levels tend to move in harmony. So it's not just simply data maturity. That breaks down into data strategy, technology, maybe your processes, maybe your architecture. However you want to break it down, that's a good way by the way, those three or four things. But the point is they're going to move up in harmony. You can't be completely great on let's say data governance but you're a zero on data architecture and technology. It doesn't work that way. You will not be. Things will move in harmony. So that thing that you do not like to pursue as a company in the area of data, maybe you don't like data governance as a company. Nobody seems to like it. You look around the office. Nobody seems to like doing data governance. Well somebody's going to need to because that's an element that brings up everything else. So look at it that way. Mid-sized and smaller companies, a lot of the things I'm going to be talking about here today, I'm going to say when I do maturity modeling, you can add one because you're not going to be able to embrace all the things that I'm talking about here. 
But you guys in the enterprise, you guys that call yourselves a true enterprise, the upper-mid market, I'm going to say, if you're falling in that category or obviously above that, all these things apply. And there's no easy way out. And what we really need are champions and leaders out there that are going to pick up the helm and find ways to get this stuff into the enterprise and get it moving towards bottom-line business success. Momentum is paramount. So what if you're at a one today? I don't know necessarily that anybody is so far gone. At least I haven't encountered any client that's so far gone in terms of their data maturity, so far behind, that they're a completely lost cause. But I will say that for quite a few of them, the pants are on fire. And we need to get moving very fast, right away, and missteps are going to be fatal at this point to that company's success. Hopefully you're not one of them, but the pants are just going to remain on fire as we go forward here in this data realm because so much is happening and so much needs to happen. So this is my funny little slide here about one thing that we're here to do, and I want to remind us of it before I get into the trends. Raise the foundation of your company. Raise it up. We're here to make sure that the things that we're doing today with X amount of effort, we can do in the future with less than X amount of effort. If you keep throwing people at the problem, you keep throwing hardware at the problem, you are basically on a treadmill, not getting anywhere. We need to be thinking and doing outside of the box. And we need to be embracing the principles that I'm going to be talking about here today. We need some of them, if not most of them, to be successful as an organization. So keep that in mind, that it's not good enough to sit still. That's the point there, really. And I know as I get into this, the money tree doesn't exist.
We can't just go and say, oh, you know, William or some person said we need to be doing this or that. We have money for that. No, it doesn't work that way. It doesn't work that way for me, that's for sure. You know, we justify our efforts and that's okay. That's expected in business. That's what I try to do. I try to hitch my architecture and my maturity efforts to an application budget. Yeah, a real application budget. Not one that is for, let's say, a data warehouse. A data warehouse is not an initiative, by the way. Big data is not an initiative. If you are treating those things like initiatives at your company, you need to be relabeling here because ultimately what the business needs is the bottom line impact of what those things do. And these are enablers to that. That's all they are. So nobody needs a data warehouse per se. They need the things that data warehousing does for them. And I really believe a data warehouse, for example, I'll get into this, but it enables so many things in the organization that the organization needs that sometimes you can be overwhelmed by all the possibilities within a lot of the things I'm going to be talking about here today. But you've got to hitch your efforts to a budget that does exist or that you create, and I encourage you to create budgets. As a matter of fact, that's the point of my next slide here. As data leaders, yes, of course, we are judged on the success of our users, making users satisfied. Now, don't look at this and go, well, we don't have central IT anymore, so this isn't really us. I don't care. We still have this thing called an IT professional. And I'm going to get into this a little bit later as well. A technology professional that has to do things that enable business leaders to succeed in business processes and functions to succeed. That's what I'm talking about. Wherever you sit in the organization, as a data professional, you are measured on user satisfaction, of course. 
And then, for many of us, we think that's it. We think that we keep other people happy. That's it. We're good. We're golden. But I'm here to say, being a data professional is not being in the internal staffing business. We're not simply throwing bodies at initiatives that everybody else in the company seems to come up with. We need to be coming up with initiatives as well because only we have the foresight into what the possibilities are out there. At least I hope that's the case. I hope you're looking out. You're here today. That's part of it, right? Getting your education and continually thinking about these things that are happening, these trends in this marketplace. Yeah, that's what's going to create business ROI and growth and, ultimately, more data maturity, which is only good for the longer-term user satisfaction and business ROI. And, of course, there are other things that we're probably measured on as well as data professionals. But I wanted to add this: whether or not it's explicit in your, what do you call it, your employee review form, document, whatever, it's there. It's there. And these are the things that set you apart as leaders: coming up with the initiatives and creating the budget, creating the excitement over new things for the business that maybe only you can do. Okay, whoops.
Now let's get into the top trends in enterprise analytics for 2019 and beyond. And I've belabored the "and beyond" bit here a bit as well. These are all the things that need to be on your radar now, whether they hit in force this year or not. So let's start with data warehousing and the like, okay? So I have an axis for you that I've come up with that is how I think about things. On the left-hand side, we have data cultivation on the y-axis. So this is level of refinement. This is the relative level of refinement that we have put our data through to be in that analytic platform. The data warehouse has some highly cultivated data in it.
I think we can agree. I hope we can agree. And I hope your data, I hope you think about your data warehouse this way. I know some of them are data dumping grounds and maybe I or somebody else wouldn't necessarily call that a great data warehouse or whatever, but data warehouses are supposed to have highly cultivated data. And as the builder, not the user, the data management professional needs to understand how that data is going to be used. Because haven't we had the conversation in data a hundred or a thousand times about how far we need to take the data to get the users to use it? Yeah, I certainly have. If the point and click of a particular item is not extremely clear to that user, then that's my fault. And I haven't done what I need to do. I have to really, really cultivate that data and bring it to a high standard. Okay, that's the data warehouse. That's what we've lived with for the past 20 years. And what about this other thing called the data lake that's emerging? And oh, by the way, get your term straight internally. I have had conversations with enterprises that go on and on about what they need, and then at the end of it, they'll say, yeah, we need a data lake to do all this. And what they really described is what we've been describing for 20 years, which is a data warehouse. Okay, a data lake is different. I know it's new and modern terminology and so forth, but it's different from the data warehouse. Now, some would say, well, the data lake, it's just become a data swamp and nobody's using it and blah, blah, blah. It's going through its early stage growing pains. There's no doubt about it, but I'm a believer in this idea that there is another level here that we need a different platform for. Another level of lower data cultivation and lower usage understanding by the builders. And so that's a data lake to me. So I'm not a data scientist, although some would try to call me that. They don't really understand our industry, but I'm building, right? 
And most people in this industry are building, and then there are the scientists, and the scientists are the ones that are going to be using the data in the data lake. I must admit, I don't fully understand everything that a data scientist is doing to the data lake that I'm part of building, and that's okay. That's okay, as you can see from the axis. My cultivation is low. My usage understanding is low. We still got to build it. It's not the data warehouse where we bake in derivations and summaries and all sorts of aggregates, and we make it all nice and nice. The data lake is going to be a lot of data, a lot more data than maybe goes on to the data warehouse. It's going to be big data as well as the non-big data. It's going to be all kinds of data types, eventually, when done well. And it's going to be bigger than the data warehouse in most enterprises. And there's another level, right here I'll call it, the data mart, which is all over the place. We have to have some level of cultivation of that data. We have to have some level of usage understanding, but maybe not to the level of the data warehouse because it might be a more narrow audience for the data mart. So my point about the trend is that we are entering a period where we are starting to understand this and we're having more sensible divisions of analytic platforms. And all of them are relevant and this is my chance to say, hey, I think the data lake is great as well, so that is an important trend. Speaking of the data lake, where are we putting data lakes? Well, the first data lakes and the first analytical places for big data was obviously in Hadoop. I believed in Hadoop. I wrote a book on Hadoop. I've been part of implementing Hadoop in many enterprises and still am, really. Hadoop has become HDFS. There's many components to it, but its definition is really kind of boiled back down to good old HDFS. 
But what my surveys are telling me, what my clients are telling me, is that they're much predisposed now to cloud storage over HDFS when it comes to the data lake that they have not provisioned yet, or the one that they have may be in a different data management platform. So cloud storage is more scalable, more persistent and available, and less expensive. And that's really key here. Again, we're talking about data scientists. We're not talking about it. If we went into data lakes with the data warehouse mentality, maybe that's where we went wrong. Thinking that there's going to be hundreds of users and thousands of users and they're going to need highly cultivated data and so on. No, that's still the data warehouse, by the way. The data warehouse, we're going to want to put that on relational database technology, not in cloud storage or HDFS. And the data lake is increasingly in cloud storage, but you should understand the trade-offs here because for some data lakes, it may still make sense to be in Hadoop, to be in HDFS. HDFS has better query performance. Again, does that matter based on what I've said? Oftentimes the answer is no. HDFS has storage formats, Parquet and ORC, that, if you will, can't be used on cloud storage. Eventually that will change, and cloud storage has some limits within it today that force us to deal with them in an awkward way, I'll say, when it comes to putting the data in and so forth. But it can be dealt with, and my clients are increasingly saying, we want lots and lots of data in there and we like the less expensive nature of it and we can't see having more than this handful of scientists in there. So what they're telling me is that they don't need the extra, I'll say extra, formatting and so forth that does happen with HDFS. Obviously that is far less than a database, and there is still obviously a place for a database or two or ten or a hundred in the enterprise.
Another trend, speaking of clouds and cloud storage: multi-cloud is becoming the norm. And again, I'm a little surprised at this, but increasingly companies are actually putting down a criterion that when they select a data management platform, they want to be able to go in different places. This obviously hurts some of the competitors out there and it helps some of the other competitors out there in the data management space, right? But disparate data-related objectives are difficult to pursue with agility in a single-cloud strategy. It's almost like we need multiple clouds. We need our private cloud. We need several of the public clouds. Maybe we need a software-as-a-service provider cloud that provides some unique capabilities. So increasingly we're getting to the point where there are a handful of cloud providers. I will say that there are three, really, and we know what they are, that are handling most of the infrastructure as a service and the platform as a service. And that's of course AWS, Microsoft, and Google. And Kubernetes is emerging, and a lot of organizations are really beefing up their skills in this in order to take advantage of multiple clouds. It's the foundation of this new generation of cloud-native big data, a container orchestration system that allows us to have our applications across multiple clouds. So this has become a really strong trend.
And it's time to break away from big data for a moment and talk about small data, small data that's very important. And it's time to declare 2019 the year of something, and I'll start by declaring it the year of master data management. Yeah, master data management. It's been around a while, but it hasn't really hit its stride until, I would say, last year, and now it's taking off because why? A couple of reasons. One is because information is increasingly being viewed as an important enterprise asset. And if that's the case, we want to take care of it in the very best way possible.
And MDM is a very good way, the very best way I would say, to take care of a certain slice of data in the enterprise. And again, I said it was small data, not large data. High quality, but not high quantity data. The other trend is the fact that we are leaving multiple data platforms in place throughout the enterprise. And I've alluded to this already quite a bit. Databases, cloud storage, some Hadoop maybe, but many, many databases, many, many cloud storages, multiple clouds and so on and so forth. We just can't afford to have our master data mastered everywhere, or we are incredibly inefficient as an organization. And you know what? On the bottom line, I've done so many justifications for MDM over the years. And I like to do it on TCO because that's the easiest way to do it. I can show the differences between a current state and a future state in terms of cost of ownership of that environment. Okay, that's all great. But in the back of my mind, the real benefit from MDM comes from the fact that you will finally have all the master data in one place. And with that comes power, with that comes the ability to spin up so many applications that take advantage of that fact, and you can drill in deep with your customers, with your product and so on. You can go places you never would if you don't have it mastered. So I know I've belabored this slide a little bit here, but I did want to impress this upon you. I know many of you are already thinking about it out there, obviously. You're not going to be alone if that's you. MDM is going to be big this year and going into the future. It's part of the future data management platform. It's part of maturity. Something else is part of that, which I won't quite declare it the year of, but I will say it's very important: data virtualization.
And whether this is using a standalone tool like a Denodo or using the data virtualization capabilities within your databases and your applications, one way or the other, data virtualization has become something you cannot avoid in an enterprise. If you accept the fact that data is important and data is an asset of the organization, you're going to have it in multiple great environments, such as the ones I've already mentioned and more. And if that's the case, well, you're going to need virtualization because you cannot put all data everywhere. You have to make decisions about where it's going to be put, but that should not lock you into a place where, well, I have some data here, I have some data there, it's going to be awkward to try to pull them together, it's going to take too long, so forget it. You have to be able to pull that report, that analytic, that process step, whatever the case may be, together from the data across your environment, and data virtualization is really the answer to that. So begin to embrace it if you haven't already. That's going to be important. Okay. Oops, going the wrong way.
2019 is going to be the year of something else as well. I'm going to declare the graph. I didn't quite say graph database, but I said the graph. Yeah, the graph. There are a lot of graph capabilities within databases now that we should be taking advantage of, and certainly there are places in our organization for graph databases today. It's not just for the nice pretty interface that it provides us with all of the nodes, where we can drill in and we can see who's connecting to whom and whatnot. That's all great and that's really a strong part of the value proposition, but what I like about it is the speed with which you can find connections and the relative importance that you can find out about the nodes on the network in a graph database.
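To make "finding connections" and "relative importance of nodes" a little more concrete, here is a minimal sketch in plain Python rather than in any particular graph database. The node names and edges are invented for illustration, and counting direct connections (degree) is only one crude stand-in for importance; real graph platforms offer richer measures.

```python
from collections import deque

# Hypothetical toy graph: node names and edges are invented for illustration.
EDGES = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "dave"),
    ("bob", "dave"),
    ("dave", "erin"),
]


def build_adjacency(edges):
    """Store an undirected graph as node -> set of neighbouring nodes."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search: return one shortest chain of connections, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the 'previous' links back to the start to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(adjacency[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


def degree_ranking(adjacency):
    """Rank nodes by number of direct connections, ties broken by name."""
    return sorted(adjacency, key=lambda n: (-len(adjacency[n]), n))


graph = build_adjacency(EDGES)
print(shortest_path(graph, "alice", "erin"))  # how alice connects to erin
print(degree_ranking(graph)[:2])              # the two best-connected nodes
```

The same two questions, how two things are connected and which things matter most, are what a graph query language would answer directly without hand-written traversal code.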
What a graph database does is it stores entities and relationships: the entities are nodes, the relationships are edges. We know about this, right? Nodes and edges have properties. Queries traverse the graph. Nodes can be homogeneous or heterogeneous, so they can all be people, like in the case of maybe the Twitter graph, who's connected to whom and so on, the LinkedIn graph, etc., or they can be heterogeneous, and these heterogeneous nodes are kind of where I'm going in a lot of cases today, helping clients find better performing ways of doing lookups, because no matter how many levels are involved in the graph, you seem to get some good consistent performance, and you don't have to use self-referencing tables and other sorts of relational tricks to get to some of the same results that we now are discovering that we need to get to with our data. So we have to understand the relative importance of nodes because we have limited capacity as an organization to do service, build product, all the things that we need to do as an organization. We have to deploy it strategically, and graph databases really help you understand what the important things are for you to focus on in an enterprise. That's what we're learning. So in 2019 organizations are going to embrace this. Find a way to embrace this in your organization. Do not keep doing what you've been doing. If you haven't been changing in the past decade as an organization, as a person, it is certainly time now to do that because if you haven't, let's back up a step here. As a data professional, and I counsel many data professionals out there sort of personally, I want you to improve your skills 20% every year.
That means in the course of about four years or so, you're going to be 100% turned over, because if you haven't, all you're doing is building up technical debt personally, and one day this is all going to come to fruition and you're going to need to change 100% and you're not going to be able to do it because you're just not used to doing it. Keeping up your skills, keeping up your ability to change and so forth, that's the way to go. So look for ways to embrace some of the things that I'm talking about here that maybe aren't on your radar today. You can do it, and get them going in your organization to success.
Stream processing. Yeah, stream processing. ETL is great. It's legacy. It's going to be around. It's still going to be deployed. No doubt about it. But there are going to be cases, plenty of them, where streaming data makes sense as well in the organization. So there are a lot of streaming, pure streaming, applications out there. So data lakes, for example, seem to be evolving towards the cloud object storage which I mentioned before, and stream computing as a manner in which to get that data in and out of the data lake. So streaming also makes sense when ETL is insufficient. So when is that? Data platforms operating at an enterprise-wide scale. Not a narrow scope, but an enterprise-wide scale. A high variety of data sources. And real-time streaming data: when the data is arriving in a real-time manner on a very frequent basis, it can easily overwhelm ETL. With ETL, you can usually get scale or you can get performance, but it's hard to get both today at enterprise levels for some applications. Some applications. And that's why we're deploying more and more stream processing. As a matter of fact, my going-in assumption for an application is we're doing stream. Talk me out of it. Tell me why we need to stay with ETL when you have the power and the same low TCO with stream processing. So that's where I want to go with moving data.
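As a rough illustration of the difference in shape between the two approaches, here is a toy sketch in plain Python. It is not tied to any real ETL or streaming product, and the event data, function names, and the "running total per customer" use case are all assumptions made up for the example. The batch function needs the whole extract before it answers; the stream function keeps a small amount of state and emits an updated answer as each event arrives.

```python
from collections import defaultdict

# Hypothetical events: (customer_id, purchase_amount), invented for illustration.
EVENTS = [("c1", 20.0), ("c2", 5.0), ("c1", 7.5), ("c3", 12.0), ("c2", 1.0)]


def batch_totals(events):
    """Batch (ETL-style): take the full extract, then compute totals once."""
    totals = defaultdict(float)
    for customer, amount in events:
        totals[customer] += amount
    return dict(totals)


def stream_totals(event_source):
    """Stream-style: update state per event and emit a fresh result right away."""
    totals = defaultdict(float)
    for customer, amount in event_source:
        totals[customer] += amount
        yield customer, totals[customer]


# Consume the stream one event at a time; each update is usable immediately.
updates = list(stream_totals(iter(EVENTS)))

# The latest update per customer ends up matching the batch answer.
final_from_stream = {}
for customer, running_total in updates:
    final_from_stream[customer] = running_total

print(updates[2])
print(final_from_stream == batch_totals(EVENTS))
```

Both paths arrive at the same totals; the difference is that the streaming path has a usable answer after every event instead of only after the full load completes.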
Moving data is still alive and well, by the way. It is alive and well and strong and going nowhere. The whole idea of data integration is as strong as ever because of all the platforms we have, the importance of data, and so on. All the things I'm talking about here today. So let's leave this for a moment. I've decided to go really broad today and kind of fire-hose you. Hopefully you're okay with this. I want to hit on all the trends that I see, because there are a lot of them today, you see.
Artificial intelligence. I mean, you knew it was coming, right? Well, I think data's new highest use will be training AI algorithms. Yeah. AI is disruptive. We know this. That's true in many ways, shapes, and forms in our enterprise. And what I am challenging my clients to do is this: anything that you have as an initiative that says BI, change that to AI and think about that. Think about doing that as AI. Now, if you think of the initiative as writing reports or building KPIs or something of that nature, you're never going to get there in terms of thinking about it as AI. You have to get behind the reasoning for the report, the reasoning for the KPI, and so on. You have to understand why that is being done. And oftentimes when you get to that why, that is what you can disrupt with artificial intelligence. So think about AI, not necessarily just BI anymore. One way, sort of the low-hanging fruit, that I've been able to introduce AI into enterprises is through automation. We see now that the BI vendors are integrating a lot of AI to automate moving predictive insights from complex data into various processes throughout the enterprise. Yeah, AI. AI algorithms are going to be important, and they are obviously trained on great data. So another mantra of mine is: let no data escape. If you've been letting data escape, you are just hamstringing your future AI efforts. So get it under control. A data lake might be appropriate for all that data if you don't have another place in mind.
And I'm often asked, by the way, on this subject: how do we architect our data management environments for the inevitable AI future that we all see? And I don't know that I have a great answer to this. I've been listening to everybody on it. I've been thinking about it. I've been working in client environments on some of my thoughts and so on. I think the best answer I have to that question for now is to build a great data infrastructure. And by great data infrastructure, I mean all the things I'm talking about here today. I'm talking about data lakes. I'm talking about having a great data warehouse in place, not necessarily two or three or four, but if you do have that many, at least understand them and their distinctions. Don't have them be chaotic messes where whatever happens today happens to them, we don't know, and they're just going off on their own. Have some data architecture over the top. Have it be something that can be drawn on the back of a napkin, or on a whiteboard, consistently by many, many members of the team. A real data architecture will go a long way in that inevitable AI future. Data is the foundation.
Now, when it comes to displaying data, I know every enterprise out there is embracing data visualization to some degree, but I think it's going to really take off, because so many organizations are still doing things the legacy way, with reporting. Reporting is great. Obviously we need it. Obviously there are regulations, and we need to report ourselves into that arena. There are going to be some battles not worth fighting: okay, here's your report, fine. But there are going to be some open users out there that we can really help, and we can help them with data visualization instead of reporting. So, like before, when I said I challenge my clients that any time they're thinking BI, think AI: any time you're thinking reporting, think data visualization. Think interactive data visualization. Maybe some of you would look at this and go, that's not very advanced. But some of these are mildly advanced, at least: data visualizations like sparklines, bullet graphs, scatter plots, and so on. I have learned that this is a real skill as well. There's a great place in the organization for the foundational person, the back-end person, but there's also a great place in the organization for people that can do stuff like this for our user community. So think about data visualization. I think 2019 will go a long way in replacing a lot of the old reporting that we're still doing out there with data visualization.
All right, what else? Self-service. Now, if I haven't already overloaded your plate as a data professional with things you need to be doing, hang in there, because I probably will by the end of this webinar. There's a lot of work for us to do building a great data foundation, and there have got to be places where we let go, and we let that go not only into the user community but directly to the users, meaning not necessarily the IT person that sits next to the user, but the users themselves. And in technology, what we're doing is we're delivering the right
curated data. If we do these things (metadata, data quality, performance), and we understand to some degree how that data is going to be used, we will move in the right direction. We will build what they need. Now, where I don't want to see a big old drop-off is in between. This has happened in spades over the years, but a great organization will ward it off, which means not having a big drop-off between the builders and the users. Self-service is ripe for abuse, and ripe for there to be a nice cliff there, where nobody wants to get into the in-between. That has to be seamless. There has to be not a bridge, but a foundation built completely between the builders and the users, and an understanding at an enterprise level of what the responsibility of each respective party is.
Now the technology organization, the technology professional, can focus on more value-added activities. I already mentioned developing new applications: not just responding with your pad in hand, "What may I do for you today?", but building the new applications of the future. Yes, we are going to be increasingly responsible for that. I think a good way to look at a future organization is to ask: is it technology driven? Is its strategy driven by technology people, people that really understand technology, or is it being driven from someplace else? Incorporating new technologies to improve the performance of what you have in place today, constantly improving, constantly raising that foundation. Technology can therefore become more of a partner rather than a roadblock to business users. There are still many abusive relationships out there between the builders and the users, and I'm going to touch on that a little bit later, so I'll drop off of this right now.
But I'm going to stay on the theme of people. I'm going to talk about the chief data officer here, because the chief data officer has gone mainstream, and it certainly appears it's
going to stay that way. Now, this may have a little ramification for some of the oldies but goodies that we enjoy in terms of titles around our organization, like the CIO. There's plenty of work for the CIO, but then again, if you look at it, there's so much work for the CIO that sometimes it takes a CDO, and some of the other roles I'm going to be talking about on the next slide. So we've got to figure this out as organizations: who's going to do what. But with the chief data officer, there has become, I'll say, some traction around what that CDO does for the organization and what that role can be. I've done some writing on this, and I talked about these objectives. Manage the project portfolio: it may not occur to everybody that this is one of the top responsibilities of the CDO, but I see it as a top responsibility. I see the CDO adding to and removing things from the project portfolio of the enterprise based upon data. Yes, that's right. That's where data needs to be. I see a high responsibility in creating accountability, and a lot of this accountability comes in the form of, you know where I'm going, data governance. The CDO is responsible for data governance in the organization, making sure data governance is at a high level of maturity. Now, the CDO may or may not participate in it, lead it, and so on, but the CDO has got to make sure it's there. And protecting the company: it's become so important today. Don't become overwhelmed with GDPR and the like as the CDO, but it is part of the CDO's responsibility to make sure that we hit the markers when it comes to protecting the company. GDPR may or may not impact your organization directly, but the idea of it should create some impact on your organization, whether or not you have customers in the EU or data on people from the EU, because something like that is probably working its way around the globe to us here in America. And it's only a good idea to understand our business to the degree that you need to
pursue that strategy. Now I'm going to pick things up a little bit. I've talked about this somewhat: organizations acknowledge that there is a chief information architect and a chief analytics officer in addition to that CDO. That's right, somebody at the C level that looks out for analytics and information, somebody that understands what we're talking about here today, and understands it to some degree of depth.
What we're also seeing is that those data science pioneers out there are getting locked in. That's right. If you are a data science pioneer, you almost already know it today. The door is still kind of cracked open for the rest of us, but data science pioneers are starting to take off. What they do is let the data speak. What is the data telling us about this business? Let's believe it. They are using statistical models and machine learning, and these have deep implications for how we work. We're seeing work change as a result of data science already. These organizations are dealing in algorithm management. They are trading algorithms throughout the organization already. They have means for doing that, they have metadata around their algorithms, and they have Kubernetes and different ways that they can deploy those algorithms upon different data throughout the organization. So that's what those organizations are becoming, and if you can't even think that way, you need to start moving in this direction of data science. What happened is there were some fake-it-till-you-make-it data scientists out there in the past few years. That's right. They weren't trained PhDs. They were business analysts. But some of them actually made it. They tried, and they did it, and they are data scientists. They did the thing that you have to do at first, which is the fake-it-till-you-make-it part of it. Before you can become a great data scientist, you've got to get in there and start trying things out, and that is what that first wave
of data science leaders did. So congrats to them. But the door is still open for others who do not consider themselves in that category. I think it's going to be very important to be in that category. I think it's going to be very important to company success.
Let's talk about data team dynamics, because I've talked a little bit about this already. The notion of the IT professional is alive and well, and there is an acknowledgement of the need for data deployments to be near the business unit. That's being reflected in organizational charts today. So I'm seeing a sea change in what organizational charts look like. I'm seeing an acknowledgement, finally, that there are technology pockets out in organizations, and it's okay. We can be above board. We don't have to be hidden in the closet about this. Yes, we have technology here, and yes, it's okay, and yes, we have a good balance with central IT. Another thing about data team dynamics that I'm seeing, and I'm grateful for this as a consultant: there is a resistance to change that we've had over the past decade, probably decades. It's still there, of course, but it's starting to get knocked down, because organizations are starting to understand how important it is that we do these things. They can't be listening to the naysayers that would have us stay the course. "Don't get in the cloud." "We don't embrace agile development; that's for someone else, that's not for us." "Never mind the data lake or big data or Hadoop; who needs that stuff?" "Why do we need to change from reporting to visualization?" "Master data management? I can do that for you here; just let me build a little database." All that sort of mentality is starting to not play well in an enterprise. We're starting to learn, and that's good. And our skills: most of us on a call like this are from the analytical side of things. We have analytical skills. Our skills are necessary now not only in this environment that we're used to, the
data warehouse, the data lake, the data mart. Guess what: our skills are now being needed back in the operational arena, and that line between operational and analytical is starting to get a little bit blurry. Analytics are starting to be embedded in operations today. Sometimes they come from the things that we learned on the post-operational side of things, but they are very much needed in the operational arena of our business.
Speaking of which, what have we been doing over the past few years to support that? We have been building NoSQL databases to support that. Yeah, not just databases, but NoSQL databases have been sort of modern and relevant to operational big data platform selection. Some of us have seen a slide like this showing all the different flavors of operational big data. But SQL hasn't been sitting around on its laurels and saying, okay, I cede the operational space, our bread and butter, to NoSQL. No, SQL has been doing a lot that makes it still relevant for operations. A lot of it has to do with its embracing of the cloud, those databases becoming more cloud-adoptive, and performance, the optimizer, and so on. SQL did not yield ground. I'm not going to say NoSQL is no more or anything quite like that, but I am saying that SQL is a strong competitor in the operational arena, and if you are thinking that way, you are probably heading in an okay direction. Notice I left graph off of that. Graph has some unique capabilities.
And now I'm just going to talk really quickly about the future: new data sets that are coming our way. There are several that are very interesting. It's a lot of fun to be looking out into the data community and understanding what data sets are available, what data has gone up for sale, what data companies are collecting which may be available in the future, and so on. One category of data set is bio-data, and if you think about it, bio-data could render all
analytics obsolete, because analytics are simply about trying to predict the future. Well, bio-data is the future. It is going to tell you exactly what the future is going to be for that person, theoretically anyway. There are some traps and trepidations on the path there. We are already seeing it, but that is part of the inevitable AI future, and data sets of that nature are going to have to be embraced by enterprises. So again: great data foundation, embrace data science, and you'll be more than ready for something like that.
So, in conclusion of the formal part here today: there's more maturity in moving imperfectly than in merely perfectly defining the shortcomings. Anybody can sit here and point out the shortcomings in the environment, but what about moving forward? Taking action. Taking action that gets to production, not just to QA, not just through development, but to production. That's where it counts. We have to be thinking about the full life cycle. Build credibility. Don't be afraid to fail. I said fake it till you make it about the data scientist. Don't talk yourself out of having a new beginning. Be open. That resistance that maybe you have, or that you've seen somebody else have, is not going to play well. So we go forward. It's about making progress, and it's about the journey. We have to be on a journey forward. We have to be embracing the trends that are going to be sticking around, and I've shared with you, from my perspective, what I think some of those are for 2019 and beyond.
Before we launch into some Q&A, hopefully you've given me some of your questions. I apologize; I failed to say to enter them as we move along, but hopefully you've done that, and you can still do that right now. Enter some questions for me for the next few minutes. I'll see you again the second Thursday of every month, same time, same bat channel. And feel free to use the hashtag ADV analytics for anything you hear here or want me to see out there. So with
that, Shannon, do we have any Q&A?
William, thank you so much for kicking off the series. What a great presentation, and we've had lots of comments to that effect. No direct questions yet, although we had a huge "Yay, year of MDM, fingers crossed." I bet I know where that's coming from. You know you're in a data management webinar when... that's awesome. So yes, if you have questions, feel free to submit them in the Q&A in the bottom right-hand corner. And just to remind everybody, I will send a follow-up email with links to the slides and links to the recording by end of day Monday. So let me dive right in here. Lots of great presentation comments. Everyone's loving it, but nobody has any questions. You were very thorough.
I'll be more thorough and mention the Q&A part at the beginning next time. We've got lots of questions now because, Shannon, you just answered that.
William, how many companies have been using ontology-based models, do you know?
I don't know. That hasn't been part of my surveys or part of some of the industries I've been in, so I don't really know about that.
Sure. So let's talk a little bit... yes, go ahead.
I'm just apologizing that I don't.
No worries. William, let's talk a little bit about the series. What are we going to be covering through the year? You talked about how we're kind of queuing everything up for the rest of the year.
Yeah, we are. It's just an exciting series that I'm so passionate about. I feel like enterprises need to embrace these things, but I also feel like people need to embrace these things. One of the very best things in my career, now over a couple of decades or more actually, is helping people: helping people learn more, do more, be more effective in their organizations. That's my passion, and that's going to be my bent as I deliver these presentations. It's going to be practical and actionable, and
we're going to drill in on things like the cloud versus HDFS. I think we're going to have one on the different databases that are out there. We're going to have one on streaming data, and we're going to have one on getting the enterprise ready for AI with data. I alluded to a lot of these things here today, but obviously I didn't have time to drill in on them. I will over the course of this year, so be sure to come on back.
I'm excited for that. I love the topic of AI. I'm just a geek, what can I say. We have another question in here for you: are companies using GraphDB to analyze ERP structured and unstructured data, and how effectively has this worked compared with a traditional SQL or in-memory HANA-type database?
There's more than one way to skin a cat, isn't there? But it is important to get to one of the best ways to do it, and it all depends. Now, I didn't mention in-memory databases, but obviously that's the highest-performing category of database. A lot of what people are telling me, though, is that it's too expensive for any application that they can find today. So I am leaning on graph databases as a means to do anything that has to do with relationships, networks, hierarchies. When I hear these words, I think about a graph database. I will not be shy about introducing a graph database into that situation over a relational database. Whether it's truly a graph workload would obviously merit a separate 20-question discussion, but if it is truly a graph workload, it's going to do so much better in a graph database, not just because of the algorithms that you have access to, but because of the ability to get to the relative performance connection points between different nodes on the graph. These are things that are hard in relational databases. Frankly, I have done it with relational databases, before graph databases were a thing. I had the self-referencing tables, and the display part, forget it, I couldn't do that. But I could do the SQL to make it happen. It was just so clumsy and long. So I
think with graph databases, far less code can achieve what SQL can do, and then some, obviously. So you're going to avail yourself of a lot more if you do think about graph databases.
William, thank you. That's perfect, and that brings us right to the top of the hour. Again, I'll send you over the comments. There are a lot of great comments here, just thanks for the great presentation, and indeed it was a great presentation. I'm so excited to kick off this series this way. Thanks to everybody who has joined us, and to the attendees who are so engaged in everything we do. We just love it. Hope to see you all next month, on the second Thursday. And again, I will send a follow-up email by end of day Monday with links to the slides and the recording. I hope everyone has a great day. William, again, thank you so much.
Everybody have a great day. Bye-bye.