Hello and welcome. My name is Shannon Kempe and I'm the Executive Editor of DATAVERSITY. We'd like to thank you for joining this month's installment of the monthly DATAVERSITY webinar series, CDO Vision. This series is designed to give you year-round education on data strategy topics in addition to our annual face-to-face CDO Vision event. We're already well underway planning next year's event, to be held in Atlanta, Georgia. This month, John and Kelly will be discussing big data strategies, organizational structure and technology.

Just a couple of points to get us started. Due to the large number of people attending these sessions, you will be muted during the webinar. For questions, we'll be collecting them via the Q&A in the bottom right-hand corner of your screen. Or, if you'd like to tweet, we encourage you to share highlights or questions on Twitter using the hashtag CDOVision. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session and any additional information requested throughout the webinar.

Now, let me introduce our speakers for today. Well-known industry analyst John Ladley is a business technology thought leader and recognized authority in all aspects of enterprise information management, with 30 years' experience in planning, project management, improving IT organizations, and successful implementation of information systems. He is the president and chief delivery officer at First San Francisco Partners. Also joining us is Kelly O'Neill. Kelly is the founder and CEO of First San Francisco Partners. Having worked with the software and systems providers key to the formulation of enterprise information management, Kelly has played important roles in many of the groundbreaking initiatives that confirmed the value of EIM to the enterprise.
Recognizing an unmet need for clear guidance and advice on the intricacies of implementing EIM solutions, she founded First San Francisco Partners in early 2007. You can meet both of them at the upcoming Enterprise DATAVERSITY conference, to be held in Chicago at the end of September. And with that, I will turn it over to John and Kelly to get today's webinar started.

Hello and welcome. Good morning. Thank you very much. Good afternoon. Or good evening, depending on where you may be. I'm obviously John, and that other person talking is Kelly, and we're going to be talking about a big data update, or is it big data update, or an update on big data? It depends on where you put the emphasis on the syllables, I guess. Then again, big data is data, and it requires data management. I think you'll see some things during our talk today that reinforce that. We're taking a CDO's view of big data today, so we will not get extremely technical, but one point we will cover is the technical nature of this area. The fact remains that most CDOs, or many CDOs, come into existence because of the rise of analytics and its supporting technology, which we call big data.

We've got some signs of success here. Kelly is the genius who created these slides, and I see that she's muted, but I'd like her to unmute, because she's a lot younger than I am and understands the modern music here. So on the left, Kelly, that's a band, right? Yes, yes, and one of the most successful things about big data is that it's so successful it has a band named after it, or rather a production company, I guess. But anyway, I think that's one of the things that always makes me laugh: big data is a band, as well as the technology and capability and all of that. Yeah, but that is cool. It has entered pop culture. We've both been doing this work for a while, some of us longer than others, but we've never seen data modeling enter pop culture, right?
I've never seen a band called Data Modelers, right? It would be a little weird. Or do they play DDW? Oh, never. Well, that's true. That's the jam. But they don't even call themselves Data Modelers. But here you have something that has entered the mainstream, and that is pretty significant.

Now, the other illustration, which Kelly provided when we put this together, was something more from my era: good old Clara Peller, "Where's the beef?" And there are some challenges, because even with all the hype, and this has been a super-hyped area, right? There are still some folks saying that it really isn't delivering on the promise, although I think that may be better in this area than in others. But we do have some doubting Thomases out there. And for those of you under the age of 35, you're just going to have to Google Clara Peller and Wendy's hamburgers.

So why are we here today? Well, Kelly and I are here to take a practical look at advanced analytics and big data. And just to set the semantics: to us, big data is the enabling technology for advanced and sophisticated analytics. You can do advanced analytics without big data, okay, big data being that volume, velocity, veracity... what's the other one? A whole bunch of V's around that thing. We want to talk about it because a lot of money is being spent, and there's talk that the return is not in proportion to the amount being spent. And out of all this, we want to give you some practical advice, because that's, after all, what we do at these conferences and on our little webinars here. And we want to take that CDO strategic and planning vision. What's the fourth V, Kelly? Volatility. Thank you. Volatility. You'd think I, of all people, would know about volatility. Anyway, you know, we're promised deeper insights. And Kelly can confirm hearing this from our clients: no results yet.
We've got several clients in the last few years, right, that have invested a lot and haven't seen a great deal out of it. And then, of course, hiring people who understand all of this has become very, very difficult. We also hear things like "the larger quantities of data dampen the errors, and therefore this data lake will produce stuff." But what we get are some pretty familiar-sounding things: they don't believe the results; they can't find anything; and what answers they do get, they don't know what they mean. So, Kelly, anything to add to that? The evidence, is there something going on here that we need to fix with big data? Anything else that you have seen?

Well, I think, just to add some color to those points, one of the things that we did earlier, gosh, I think it might have been last year, was a study around business intelligence, analytics and predictive analytics. And a lot of big data falls into that predictive analytics category. This was a DATAVERSITY research paper, if anyone remembers reading it. It was quite interesting in the sense that, yes, there's a tremendous amount of investment and hype behind big data, with indeterminate sorts of insights; the insights they're looking for are a bit more of a needle in a haystack. And I do think that since then, there has been a change. We'll talk a little bit about what has changed and why people are starting to get more insights. But one of the biggest challenges, as John's highlighted here, is that there's still a lack of skills on the market. Despite the emergence of study programs and university programs, people are still getting a large volume of their information about big data from vendors and from the web. So it's less around practical knowledge and having been there and done that, which is what people are looking for when they hire.
So it's this growth cycle; we're not yet at a place of great resources on the market. I mean, we feel the same thing when we're looking for resources within our own company: it is hard to find people who have been there, done that, and learned the hard way, which is really important. So I just wanted to provide some more detail and color around that.

Well, that leads into this slide, John, if it's okay if I keep going. The current state of big data efforts is either extremely exhilarating for some or extremely exasperating for others. It's one of the reasons this photo kind of captured it for me: it tends to be a younger, I guess less jaded, group of individuals that finds it really exhilarating and so exciting, and it tends to be the older generation that finds it exasperating, frustrating, and doesn't see the value. And so I think one of the things we need to consider as an industry is tying those two worlds together, because the reality is that it's both. It's hugely exhilarating and a great opportunity, but at the same time it's about overcoming some of those frustrations and continuing to move on and keep trying.

Okay. So we're going to talk about some of the things we're seeing that are creating these challenges for big data. And we're going back to the root-cause slide that we have used on and off this year, because it is the result of some good research among us and some of our peers. We mentioned those folks earlier: James Price, Tom Redman, Miguel Ray, Doug Laney, and Kelly and I. We'll get into more details on these, but very quickly: the operating frameworks around big data and analytics in terms of governance and leadership are some areas we'll explore, along with the justification and how it's being used.
We will look at the awareness and expectations around things, this whole business of "yes, we're going to be a data-driven organization." We've even had a client in the last few years that said they were going to turn into a data company, and then proceeded to have no really clear vision of what that meant. Then, of course, the alignment Kelly mentioned earlier, the conclusions and what we want out of that, and then how you enable this to move forward. So we're going to frame that in our root-cause framework. And if you need just one takeaway from the whole thing, or you need to go right now and have lunch somewhere, I think you can see that there's a lot in common with all of our enterprise information management and data governance disciplines here. Big data is not on its own in terms of some of the critical success factors.

So first we're going to talk about organization and management and the alignment-type things, diving into our details a little bit more. I'll be talking about these, and Kelly will provide some commentary along the way. First of all, that operating framework. When we say no governance, inadequate leadership, the evidence here is that data lakes are becoming swamps. We don't want to belabor this point; it's kind of all over the place already in the big data literature. But gee whiz, people just grab data and throw it in there and expect miracles, and it isn't happening. Or it requires someone who is so extremely trained to plow their way through the data that you often wonder if there's a return on their results. If you're paying someone a good, healthy six-figure salary and they're coming up with five-figure benefits off their analysis, why are we doing this?

And that gets you to "walk the data talk." If you're going to be data-driven, if you say you're going to understand what the data says, you need to be willing to listen to it. There are a lot of really good examples of where big data is helping companies.
But in every one of those companies, when those numbers first came out and said, Company A, you need to go left when everyone assumed you should go right, the first thing those cultures said was, we don't believe that number, because sometimes the results are culturally counterintuitive, and it's really hard to accept that. Management is looking for miracles and they think it's the wrong answer, or they're looking for a one-off miracle, and this is like anything else: it takes some work to get there. And the lack of awareness of what you can expect from this, the cultural aspects of changing the way you actually make decisions, puts you into an immature situation very often, requiring you to define what data-driven really, really means. So in those three areas, Kelly, anything to add?

You know, I think that some of this, and we will talk about this a little bit later in the slides, is this catch-22 around what we can accomplish and what the challenges are. Some of the challenges are based on a pure lack of systems and capabilities to accomplish it. For example, no governance. The governance around big data has always been a struggle because of the lack of security in the platforms, the lack of metadata in the platforms. And guess what, the good news is the technology providers recognize those gaps and they're building towards them right now. And so what that also means is that, as data governance folks, we need to adjust too and recognize: what are the metadata capabilities within big data, and how do we apply our traditional policies, processes and standards to those things that are relevant and meaningful within that category as well. So I just wanted to say that a lot of these issues are rapidly evolving, because there is a lot of hype around this, and therefore people are still paying attention to it. And so it is rapidly evolving.
And the challenge for the CDO is looking at the current state versus the in-flight rapid evolution, and therefore also, how does that sync up with the goals and expectations of the future state? Spot on, spot on.

The business alignment aspect: miracle hunting versus solid results. And yes, you hear about miracles, right? You hear about organizations that found that little nugget. But where you're really seeing the real payoffs are organizations that find some change in their customer touch points or an operational manifestation of their analysis. But they were looking for that. They had an intent to do that. They had a business strategy that required them to look for a solution to achieve that strategy. Stuff isn't going to leap off and stick to your face. And I know it sounds strange, but we have actually witnessed companies that have spent a lot of money because somebody told them they needed to do analytics, and they're waiting for some miracle to leap off. But there's no air cover for the people to solve a business problem.

The other thing you start to get into is folks getting nervous about the costs and the results. And you get into, it doesn't matter what the project is, a very common symptom in our information world: at some point, someone says, show me some value from this, I'm tired of waiting. And this is no lie, no fib, we've seen this; Kelly can corroborate it, or else she'll just stay on mute and pretend I didn't say it. They end up doing production reporting or operational reports out of an eight-figure capital investment in hardware and software, because now all of a sudden someone's nervous that they have to deliver something, which is taking it entirely in the wrong direction. We've got a lot of common problems, which we've kind of touched on, and we'll just skip over the metadata part. But a lot of times you throw all that data in there and it's still unsuitable for analytics, right? It's just not there.
And no one can find anything without extreme amounts of intimate knowledge of things. And then there's the whole problem of everyone going out and buying their toys first and then figuring out what to do with them. And this is becoming, Kelly might even have to rein me in and reach through the phone and slap me, but this is becoming a disease. Kelly, I don't know if we want to elaborate on that or not in this world of big data. But buying the technology first and expecting the miracle is becoming almost a theater of the absurd with a lot of the folks we get called in to help. I don't know if you have any more comments on this or the other parts of the slide.

I think, and there's always been this trend, especially being out in Silicon Valley where there is hype around software and technology capabilities, there's always the "I'll buy the software and that will make all things better." And I think that occurred in spades with big data, frankly, because nobody knew what else to do. So they would buy the software and realize, oh my gosh, I really don't have anything else to do with this big Hadoop database, so I may as well use it as really cheap storage and show some value that way, so that it's not seen as a throwaway. I just think it's reflective of other trends; people got so excited that they went out and bought the tools, and then they reined it back a bit and are now reassessing: what does this really mean to me? And now that I already have this infrastructure, I have a more intelligent way of taking advantage of it.

Well, we're going to talk now, at least organization-wise, about how to fix it. Kelly, this is something we talked about earlier, you walking us through it and providing the commentary. Sure.
And so, obviously John and I have a perspective here around the role of governance as a conduit for business goals, expectations, and drivers as they pertain to data. One of the things that we're seeing is the alignment between that governance office and the analytics group, because we believe that big data is a tool for an overarching analytics strategy. Having that tight alignment is really one of the best ways to create and ensure the understanding of the data and the trust in the results, and also to identify the requirements that come from the business side.

So if we look at this model for just a minute, this is a sample model that articulates how an analytics organization and a governance organization can work together and be synergistic. In this organization, there was a head of analytics role, and analytics in that organization meant all kinds of analysis except for operational and managerial reporting. So the tools were not necessarily all big data; some were traditional analytics tools, but it was all under the one umbrella of analytics, and they just identified the resources needed to solve the problem they were looking to solve. The governance office was a peer organization, and that peer organization worked quite closely with the analytics group: the analytics group fed requirements into the data governance office, and the data governance office supported the analytics group and also fed requirements back into analytics when they were seeing demand for certain sorts of data and prioritization of data. Now, if we look one step lower than the business analytics and the data governance office, there's this whole community of people that execute on the governance guidelines and execute on the analytics guidelines.
Whether they're direct reports or a virtual organization, I don't think really matters for the most part. But something to recognize is that when you have the lines of business represented in a governance organization, those participants, whether they're called data stewards or business stewards or just data subject matter experts, many times could be the same people who work within an analytics organization to deliver analytics and the outcomes of a big data environment into the different lines of business. So there's a lot of cross-pollination that could occur between an analytics organization and a governance organization: to ensure that the data is trusted, so that people believe the outcome of an analytics exercise and a big data exercise, and so that there is the level of control to ensure that the data lake doesn't become the data swamp. It becomes very synergistic because you've got this cross-pollination.

And the circle could stretch even a little wider, in the sense that there's a data management group that supports from a technology perspective on the IT side, in the same way that there's a data management group that supports the technology on the data governance side. And as the tools develop more sophistication in things like data quality and metadata and traceability, those roles will also become much more similar and much more synergistic. So one of the takeaways we want to provide from this webinar is that alignment between the traditional data organizations and these new big data and analytics organizations, to make sure that there is trust, understanding, and usage and consumption of the output of that data, and to make sure that the value is delivered.
So we spent the first third of this talk on the problems, and I just want to reiterate, because we tend to be practical here: this is not an org chart, but it shows a bunch of elements, right, Kelly, to help address some of these problems and root causes. One that I wanted to re-emphasize is this enterprise infrastructure committee. Infrastructure, or enterprise architecture, or oversight needs to be at the table. There is a tendency to have these large big data environments where either it's outsourced, or not outsourced but in the cloud, "clouded", is that a word?, or a business group goes off and acquires it, or IT does it because they were told to, and architecture and infrastructure are on the outside looking in. It really behooves you in the long term, and we'll hit this one again in the second half of our talk, it really behooves you not to do that. That's where you get into a lot of alignment issues, a lot of contradiction issues, a lot of inability to find things. And there's a lot happening with technology to support the big data environment not being off on its own.

This tight alignment between the data governance office, or operating model for data governance, and the operating or engagement model of analytics is really, really a powerful tool. You don't want them off on their own. Very symbolically, the last thing on this: we put the CDO at the top. The top data job could be the CIO in some organizations, but somebody has an eye on both legs. Somebody has an awareness of what's going on amongst all of these things so we don't have a disconnect, some accountability for that. And in some organizations it's tightly aligned, where this could look more like an organizational chart, but in others analytics doesn't report up through the chief data officer, hence the requirement for working together virtually, in collaboration. Right.
And I think the evidence is strong that in organizations where the CDO, or the top data job, or the CIO has been left on the outside looking in at a big data effort, you have seen less than sterling successes in a lot of those. Awesome. Very good.

Just a reminder here before we move on: we do love answering questions. If you hear anything that you disagree with, or want to have clarified or expanded upon, or just want to talk about, then please enter it in the question part and I will make sure we answer it.

Let's move on now to the technology. We talked about kind of a visit to the state of the art, and it is important. Kelly will agree that I tend to put the technology second and the why-are-we-doing-this and what-for first. We kind of do our work that way at First San Francisco. But we do need to talk about it, because my goodness, it is changing, and there are some really cool things going on. There's this whole company called FirstMark. Again, Kelly, we talked about big data being a rock band or a media company. The first time I saw this graphic was on social media; it was on somebody's Facebook page, and they said, look at the cool picture I found. I went and dug it up. This is the universe of technology now. I guess my first thought is, if you are going to go out and buy something, where are you going to start without a plan? Good heavens. This is easy, right? Down there on the bottom you see FirstMark, or you can just Google "Big Data Landscape 2016". You can go find a nice big picture. It is a lovely little site, a couple of really smart people who are keeping abreast of things, but holy cow. Obviously there is a lot of money running around here. Where do you start? You need to have a little bit more idea of your scenario. That is kind of how we are going to frame the technology and the changes here, around some various scenarios. Things are evolving.
The first big change that you really need to be aware of, no matter where you are around big data, and head towards, by the way, is this technical versus business conversation. I will admit, about five or six years ago, to being dubious about big data. There was a part of me that thought it was going to be like knowledge management, or artificial intelligence: it was going to sound really cool and then kind of disappear into the woods for a while. That did not happen. What did happen is it became an extremely technical thing. If you look at any of the blog-type social media, the Tumblrs and things like that, the people engaged in this are all highly technical, and that has, in a sense, isolated a really cool solution from the people who can use the solution. The market is driving very rapidly towards simpler terms, easier explanations of things. You should not need to know a whole bunch of acronyms that you do not understand to use the tools.

An offshoot of that is the return to known access methods. So SQL: whether you are a NoSQL fan or not, a lot of people know SQL. That is how they like to look at data. Everyone has a row-and-column mindset; most business analysts view the world as a spreadsheet. So those types of connectors to common tools they are familiar with, like SQL, will be more and more dominant, as well as just putting the data in a proper place. You are going to see the concept of the data pond as a subset of the data lake, where the data is treated so that it is more consumable by more people than just a data scientist. You are going to see a lot of that. Rather than doing the report just to prove you got something out of this pile of data, you are also going to see, especially around the Internet of Things, a lot more mission-critical, managerial, operational-type things where you are driving numerical conclusions. We used to call it real-time analytics.
So you are driving the conclusion, and more and more this is becoming real, and it is going to just keep happening. Old data warehouses are not going away; they are being integrated into this. In other words, if someone finds it easier to use an analytic or algorithmic process to create a data set that is useful on an operational basis, then that gets put into a traditional row-and-column ODS or a traditional row-and-column data warehouse. That is fine. Find the right spot for the data. It is not necessarily old versus new. There is this talk that the data warehouse is dead. It is a euphemism for change, but there is always going to be the need for a static historic data source, et cetera. The conversation is changing, but we are not getting rid of the old.

What I will do is cover the next one, and then I will bring Kelly in for any comments. We also now have, believe it or not, because we measure all of this in dog years, where one year of tech is seven years of normal human progress, it seems, you are starting to see, well, that is old technology. A lot of your original Hadoop lakes and things like that are getting re-architected and restructured. Our advice to you is, do not back away from that. Performance: some of this stuff is slow, and there are ways to speed it up now. Embrace that. Integrate it into the rest of your enterprise architecture. The big data analytics cluster of technology should not be a one-off, isolated bundle of stuff that a bunch of data scientists use. It is an integral part of your information asset portfolio and needs to be treated as such. Now, that is the old stuff; what about the new? Well, Spark has taken over as even a substitute for Hadoop in some areas, but certainly as the go-to supplement, the wrapper around it, the way to get to it, et cetera. Again, I mentioned SQL earlier. NoSQL is storming along more and more all the time.
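The "return to known access methods" John describes comes down to exposing big data stores through the row-and-column SQL interface analysts already know (engines such as Spark SQL, Hive, or Presto do this over a lake). As a minimal, engine-agnostic sketch, this uses Python's built-in sqlite3 purely as a stand-in for such a SQL endpoint; the table and column names are made up for illustration:

```python
import sqlite3

# Stand-in for a SQL endpoint over a big data store; a SQL-on-Hadoop engine
# would expose the same row-and-column interface via JDBC/ODBC.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (user_id TEXT, page TEXT, dwell_secs REAL)")
conn.executemany(
    "INSERT INTO clicks VALUES (?, ?, ?)",
    [("u1", "/home", 3.2), ("u1", "/pricing", 41.0), ("u2", "/home", 1.1)],
)

# The business analyst's spreadsheet mindset, expressed in plain SQL:
rows = conn.execute(
    "SELECT page, COUNT(*) AS visits, AVG(dwell_secs) AS avg_dwell "
    "FROM clicks GROUP BY page ORDER BY visits DESC"
).fetchall()
for page, visits, avg_dwell in rows:
    print(page, visits, avg_dwell)
```

The point is not the engine; it is that the consumer never sees the storage layer, only familiar rows, columns, and aggregates.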
Use of semantic web technologies, use of graph databases that are wrapped around or fed from these structures: awesome, powerful, powerful stuff. That is why you have to work this all into your enterprise architecture. If it is woven together the right way, it can be really, really powerful.

One area that I am going to elaborate on is the cost of ownership. Kelly mentioned it: everyone went after this because "I can put all this stuff in there, I don't have to spend the money organizing it, and we can put so much stuff in it we can't even imagine how much it is." But you know what, we are hitting cost constraints now. In the last couple of years, organizations have said, even if it is in the cloud and someone else is doing it, it is still starting to cost me an awful lot of money. Even if it is only pennies per terabyte, if I have all these terabytes, and I am hammering it all the time, and it is always changing, and I have to pay someone to take care of it, it is getting expensive. Now add to that the cost of scarce labor. You are talking some pretty healthy salaries. So those are the, I am sorry, a bunch of questions just popped in and totally distracted me. I apologize for that. There are a lot of cool pictures out there of server installations that are next to a river, because that is the only way you can get enough water to cool off the servers. The scale of this stuff is incredible, and it does not matter how cheap it is per byte or per gig or whatever; when you add up all those numbers, it gets really, really, really expensive. Add in some expensive labor, add in some training costs, add in the cultural impact. This is expensive, and it is being seen. It is not going unnoticed in any executive suite, and you are going to start to see in the next year, and we strongly advise our clients in the analytics world, to start to look at cost of ownership and ask, where should we put this data so it is used in the best way?
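The "pennies per terabyte" arithmetic above is easy to underestimate, so here is a back-of-the-envelope sketch of how storage, churn, and labor compound. Every number below is hypothetical, chosen only to show the shape of the calculation, not a benchmark:

```python
# Hypothetical total-cost-of-ownership sketch; all figures are illustrative.
storage_tb = 5_000            # data lake size in terabytes
cost_per_tb_month = 20.0      # storage cost, USD per TB per month
churn_factor = 3              # data rewritten/reprocessed ~3x per month
data_scientists = 4
loaded_salary = 250_000       # fully loaded annual cost per head, USD

storage_annual = storage_tb * cost_per_tb_month * 12 * churn_factor
labor_annual = data_scientists * loaded_salary
total = storage_annual + labor_annual

print(f"storage: ${storage_annual:,.0f}/yr")  # cheap per TB, big in aggregate
print(f"labor:   ${labor_annual:,.0f}/yr")
print(f"total:   ${total:,.0f}/yr")
```

Even with modest per-unit prices, the churn multiplier and scarce-labor line items dominate, which is exactly why "where should this data live" becomes an economic question.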
And best means a lot of things, but one aspect of best is the most economical and efficient way to do it. If we need to get data out of a hard-to-find place, where I need a scientist to work with it all the time, and put it into something a bit more mechanical like a warehouse, well then, by gosh, we are going to do that and go from there.

The next thing is mostly around the Internet of Things. We are working on something like that now, and Kelly can maybe chime in on that one, but everyone knows the Internet of Things is kind of the next cool thing, or the next permutation of this area. We are well into it now. I am not so sure I want my refrigerator telling me, or telling anybody else, what I am doing, but that is kind of where we are headed. Of course, that makes privacy a big thing too. But what we are seeing here: these are not business events. Me opening the refrigerator and taking something out of the freezer that weighs about five pounds, so the refrigerator says John is defrosting a turkey or a roast or something like that: fine, but that is not a business event that I want to keep. The relevance of that data is not persistent. So there is a lot of churn in these databases, and the mindset of just leaving stuff out there is totally irrelevant, because you are using this stuff to tweak customers, reach out to consumers, adjust touch points in your customer life cycles, all that kind of stuff. The data is no good after a month or six months or something like that. The data erodes; its half-life is really, really short, and so you need to adjust to that. We thought an operational data store at one time was hairy and scary; then real-time BI was hairy and scary. Now we are getting into petabytes and exabytes of data that are churning and turning themselves over maybe dozens of times a year. That has a lot of technology considerations too.
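The short half-life John describes implies retention logic rather than keep-everything: events older than a rolling window have eroded and carry no decision value, so they get pruned. A minimal sketch, where the 30-day window and the event fields are hypothetical choices for illustration:

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # hypothetical half-life-driven window

def prune(events, now):
    """Keep only events still inside the retention window; older readings
    have 'eroded' (e.g., stale IoT sensor data) and are dropped."""
    cutoff = now - RETENTION
    return [e for e in events if e["ts"] >= cutoff]

now = datetime(2016, 9, 1)
events = [
    {"ts": datetime(2016, 8, 30), "reading": 5.1},  # fresh: kept
    {"ts": datetime(2016, 6, 1), "reading": 4.8},   # eroded: dropped
]
print(len(prune(events, now)))  # → 1
```

In a real platform this shows up as TTLs, partition expiry, or compaction policies rather than an explicit loop, but the economics are the same: churn is designed in, not an accident.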
I am going to stop my monologue on the technology here and turn it over to Kelly for her contribution to our technology things. Let's see. Is she unmuting? There she goes. Hello Kelly. Welcome back. Sure. I think it is really just about, as we learn, understanding what is fit for purpose and what is best to be used for these different things. Part of what we go through, which is a positive thing, is the learning process of what works, what does not work, what is a good cost of ownership, what is it really valuable to be used for. Ideally we are failing fast. We are learning in a very quick way that it does not make sense to throw out the data warehouse. What makes sense is to use the data warehouse for the typical descriptive and diagnostic sorts of analytics, if we take the Gartner approach of descriptive, diagnostic, predictive and prescriptive. Sorry, I cannot get that word out. There are reasons for maintaining some of the quote-unquote old technologies as we look towards predictive and prescriptive, and leveraging some of the high-volume processing to look for some of those needles in the haystack, which prescriptive might be more aligned to because we don't necessarily know what we are looking for. Again, just going back to leveraging the technologies that are best suited for that purpose. And I guess dealing with executive expectations, to ensure that there is enough of an understanding that you don't just do big data, that you actually have a purpose behind what you are trying to accomplish with big data, rather than just doing it because somebody came back from a conference and that is what is on everybody else's agenda so it should be on mine as well. Yeah, absolutely. We threw in a reference architecture. We update this once or twice a year, and this little presentation was a good reason to go back and revisit it.
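The Gartner-style ladder Kelly references can be summarized like this. The example question attached to each rung is our own paraphrase for illustration, not Gartner's wording:

```python
# Gartner-style analytics ladder; the example questions are our own
# illustrative paraphrase of what each level tries to answer.
analytics_ladder = [
    ("descriptive",  "What happened?"),
    ("diagnostic",   "Why did it happen?"),
    ("predictive",   "What is likely to happen next?"),
    ("prescriptive", "What should we do about it?"),
]
for level, question in analytics_ladder:
    print(f"{level:>12}: {question}")
```

Kelly's point maps onto this ladder directly: the warehouse remains well suited to the first two rungs, while high-volume big data processing earns its keep on the last two.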
And Kelly and I will walk through this, and I do see questions coming in, and I also see some really cool editorial comments, which we will cover. Let's see here. Sources in the reference architecture. We have structured sources, internal and, don't forget, external. A lot of organizations are buying data and they want to throw that all in there as well. And even if you have a data lake or some similar big data construct, you have your internal business events, et cetera, and when you go external to somebody, those data models never line up. That problem has never gone away. It never will. And the analytics are difficult to do without some reconciliation. I kind of invented a term here, variable structure. Internet of things could be considered structured, it could be considered unstructured or minimally structured. The fact is that a lot of the sources that are coming in arrive over so many protocols and with so many types of latencies that this area is a discipline in itself. That's my opinion. You can certainly argue with me on that one, but that's my opinion. Then we have semi-structured, and here we go with the initials, XML and JSON, that type of source. And then the unstructured, with click streams, PDFs, consumer sentiment surveys, those kinds of things. Acquisition-wise, besides the typical staging of the semi-structured and unstructured, one thing we cannot overlook now is the streaming. This is very quick data. It has a very short half-life. It is streaming. It is here. It is gone. It's like a little data firefly. I don't mean to be that poetic, but that's really kind of what it is. And that is a separate type of acquisition and handling process. On the management side, what we had last year was cleanse and transform, create the golden record. You probably would like to have some master data and reference data to bounce these things off, so you have good dimensions.
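As a toy illustration of the acquisition step above, here is a sketch of normalizing one semi-structured JSON event into a flat, staging-friendly record. The field names and the event shape are invented for this example; as John notes, real internal and external models never line up this neatly:

```python
import json

# Sketch: flatten one semi-structured JSON event into a staging record.
# Field names here are invented for illustration only.
def stage_event(raw):
    event = json.loads(raw)
    return {
        "source_id": event.get("sensor", "unknown"),      # default if absent
        "metric": event.get("type", "unclassified"),
        "value": float(event.get("value", 0.0)),          # coerce "3" -> 3.0
    }

raw = '{"sensor": "door-7", "type": "open_count", "value": "3"}'
print(stage_event(raw))
# prints {'source_id': 'door-7', 'metric': 'open_count', 'value': 3.0}
```

The defaults on each `get` are the reconciliation point: a source that omits or renames a field still lands in the same staged shape instead of breaking downstream analytics.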
But in the management area, the thing we've added now in 2016 is total cost of ownership. It can be burdensome. It can hurt. And so it needs to be brought to light in the management of these things. In addition to our persistent stores of the traditional warehouse, the ODS and the data mart, we have Hadoop. We also have data lakes and ponds, perhaps, within your structure. And we also have Spark now in the reference architecture as something that can stand alone in lieu of Hadoop or supplement Hadoop. There are a lot of other technologies on the way too. I'm not going to get into them, just in the interest of time today, but they are worth researching. I like that chart we showed. There's an awful lot on there. In terms of delivery of analytics, another data retrieval emphasis we have added is SQL or NoSQL type access and delivery, and also staging of these in NoSQL type data stores like a graph database, like we mentioned, or maybe a more traditional SQL thing, but being more useful at a touch point like mobile BI or something like that. So the reference architecture is changing. It's evolving. Kelly, anything to chime in on our reference architecture here? No, except just a shout-out to one of our consultants, Tamara Saliman, who helped us with the 1.0 version of this. So this is based on that original slide that she kind of crafted. Very good. Thank you very much. The other thing I did was make data governance really thick and big so it really stands out, folks, because you've got to wrap some governance around this or you're not going to be very effective. So on that, we'll just move on here to kind of a wrap-up. Kelly and I will do a bit of a wrap-up here on this slide, and then we'll move on to the questions and answers. And once again, folks, it looks like we have enough questions for about 10 minutes, so there's room for more questions. So please, please, please ask us questions.
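As a minimal illustration of the graph-style access John mentions, here is an adjacency-list sketch of the kind of relationship traversal a graph store makes cheap. This is a toy in plain Python, not any particular graph database's API:

```python
from collections import defaultdict

# Minimal adjacency-list graph sketch: the kind of one-hop relationship
# lookup a graph store optimizes, versus joining tables in SQL.
class TinyGraph:
    def __init__(self):
        self.edges = defaultdict(set)

    def relate(self, a, b):
        """Record an undirected relationship between two nodes."""
        self.edges[a].add(b)
        self.edges[b].add(a)

    def neighbors(self, node):
        """Everything one hop away from the node."""
        return sorted(self.edges[node])

g = TinyGraph()
g.relate("customer:1", "product:A")
g.relate("customer:2", "product:A")
print(g.neighbors("product:A"))  # prints ['customer:1', 'customer:2']
```

The design point is that the relationship itself is the stored structure, so "who touched this product" is a direct lookup rather than a join, which is why graph staging suits touch-point delivery like the mobile BI case above.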
This is your chance to get insight and cool stuff, so please take advantage of it. Resource-wise, Kelly, let's just bang through these together here. We talked about skill resources, but we also talked about that operating framework where we've got to interact with the other elements of enterprise information management. We don't want to be standalone, right? Absolutely. Absolutely. It is a collaborative approach, in order to make sure that it is an end-to-end trustworthy exercise. Good. On the alignment side, descriptive business intelligence isn't going away. The Gartner way of looking at it is kind of a cool way to look at it, and descriptive BI is here. It's here forever. A lot of clients confuse analytics and BI, especially folks that aren't immersed in this business the way, say, those of you listening, or Kelly or I or folks working with us, are. To them, it's just a way to get to an answer, right, Kelly? There is no discrimination between them. That's right. That's right. Yeah. And to treat it as something special and use big fancy words and initials is really doing your peers and your organization a disservice. I don't want to sound harsh about it, but you can come off as arrogant, not intending to be, but as an old-fashioned technologist being arrogant with technology, and really not help your cause at all in the data management world. We also need to be explicit about value, right? The answer is not just going to leap out. Let's have a problem to solve. Let's have at least a business goal or strategy that we are trying to support, something like that. But Kelly, this almost smacks of the early-90s "build it and they will come." Maybe not so much that, but we've got to be careful of that, right? For sure. For sure. Because guess what? There's going to be a new thing that's all hyped up within the next seven years for sure, and then this will become old. Oh, and you can send me the invite to the webinar you're doing.
I'll be glad to listen. Outcome: optimization across descriptive and predictive. We need to do the most efficient thing where it needs to be done. There is not one blanket answer. There's no one blanket bucket, folks. I'm sorry. There's no miracle where you can just dump everything in and the answers pop out like frogs out of a bucket, okay? It's not going to happen. You've got to work with your enterprise architects. You've got to engineer this stuff, all right? Anything on that one, Kelly, before we move on here to questions? You got it. I got it. All right. All right. Let's do some questions here. First question. I always want to say, like, "this is from Fred in Cleveland," but I don't know where any of these people are from, so I can't say that. Anyway, this particular listener recently attended the MIT CDO conference. Much talk about data, data, data, data. Very little talk about the actual systems that produce the data. The data doesn't just appear. It comes from systems. What gives? Well, you know what you're hearing there, and I've never been to that particular conference, I don't know what their emphasis is, but this is kind of what we talked about earlier, right, Kelly? This emphasis on technology and the tools goes back to the old days of the BI vendors saying, yes, we can get you these reports in 30 days, just do this and slice and dice and all this. Then someone says, yeah, but how does the data get in there? We don't do that. We don't do that. No, it doesn't just appear, does it? Our reference architecture has sources on it. That's why. It's a lot of work to get it from the left side to the right side of that chart. I read somewhere that someone said 30 to 40 percent of a business analyst's time is just spent gathering and collecting the data, and I laughed at that because it's really more like 70 or 80. To that person's question, Kelly, I don't know if you agree or not. If you don't, please expand on it or reinforce it.
We all tend to overlook the hard stuff a lot of times. Yeah, absolutely. All right. Let's see here. Someone shared one of the editorial comments, which is worth repeating, about that chart of what to buy: just throw a dart at it and buy wherever the dart lands. I don't know. There might be some merit to that. There have been several comments, Kelly, and this is something to reflect on, because you've been in this industry a while too with data-intensive things. We're repeating ourselves right now. Big data has reached a point where we can step back and say, here's what's working, here's what's not working. We're repeating ourselves on a lot of this stuff. Absolutely. I think that it is a learning process, and that's okay. I think that we are, but what we've learned in the past we also need to start applying. One of the comments that was made earlier is that these pictures and this discussion sound a lot like data warehousing in the 90s. I think we do need to recognize that history can repeat itself, and ask whether we are remembering that and learning from it and, like I said, failing fast and moving on, rather than continuing to spin our cycles and drive up costs, which are bad for organizations but good for software companies. Exactly. Here's a caveat with that. There is a tendency to say, well, we said this all before, so this is the same. Be careful, because that's not true. Remember, big data has gone mainstream. There's a rock group. There are commercials on television in prime time where people are talking to Watson. When I was coming up into the workforce, my view of that was HAL, and HAL was not a nice computer in 2001: A Space Odyssey. So there is a real difference here. Now, that said, when we say that these problems are the same, it's not that big data is the same as early instantiations of data warehousing. What is the same is that the business still needs to have some help from data to solve problems.
Business hasn't changed its requirement, or its request for data assistance, in 40 years. Plain and simple. Since data warehousing started, we're now looking at almost 30 years of data warehouse. Before that, it was decision support. Before predictive analytics, we had data mining. It was creating answers to questions you haven't asked yet, and that was all over the advertisements. So what has not changed, and that is why these success factors keep coming up, is that the technology side has ignored the business. This is what we're telling you is the bottom line of our presentation here today. Don't ignore using business language and solving the business problem and addressing that. The technology is changing. We're seeing a lot of adjustment to where the demand and the market and the attention is, which is on users of information. But that's why it's not that this is the same. It's not that we're making the same mistakes. It's that we still haven't solved the problem that we promised to solve, in a cruder fashion, 30 years ago, but we are still talking about a lot of the same problems. Now, in addition to that, we're talking about a bunch of new stuff with the internet of things and much shorter half-life data and much lower latencies, et cetera. We didn't have the web 40 years ago, but there's a lot of stuff that's the same here. I'm going to let Kelly wrap up with her comments, and then we're going to turn it back over to our host, Shannon, here. Kelly, anything to add? No, I think that's well said. I think that's well said. You've agreed with me on everything today. That's probably a first. We're going to have to do something more controversial. Shannon, I don't see any more questions coming in. There are some more editorial comments, but we kind of touched on those. Let's see. Hold on. Is there one more here? There's one. I don't know if I can answer this. Kelly, are there bigger big data projects in the world than DBpedia and the emerging Wikipedia project?
Do you consider DBpedia to be successful? What are we learning from the Wikipedia organization that can be replicated or reproduced by smaller big data projects? Is that something you thought about? I could talk out loud about it, but we don't have the time. Is that something that you've considered? I don't think that I'm the best person to say what's the biggest big data project in the world, and I could only imagine that it probably has something to do with governments. Well, there is that part too. On that bombshell, to quote somebody who no longer has a job, I will turn it back over. Well, I'm sorry. One more just came in. How do you envision defeating the catch-22 of a workforce that does not have enough experience? That's a good question. I'll give my 15 seconds. Kelly, you can give your 15 seconds. When this has happened before, and it happened in the early 80s and happened in the mid 90s, you train them. You hire them, and you train them, and then you create a workforce where hopefully you keep a fifth of them. The other 80 percent go work for somebody else, hopefully not your competitor. Kelly, what do you think about getting around the catch-22? Well, I think that it ultimately goes in cycles and kind of solves itself, in the sense that in the technology world the universities have always been playing catch-up, and the academic institutions are always playing catch-up. It's no different here. The good news is they are playing catch-up, and they are ultimately getting to a state where they are training people to be productive quickly once they get out of the educational institution. I had dinner last night with the intern that we brought on board, and she's getting a master's in analytics. One of the things that she was saying is that it's nice to do an internship, because what you learn in the classroom is very conceptual, and it's not really clear how it can be applied. I thought that was really telling, right? So we are still learning as we are doing, and there's nothing wrong with that.
It just means that we need to recognize that. And John, just like you said, as we train these fantastic, highly paid people, we need to think about retention plans or they will go work for our competitors. Absolutely, absolutely. I hope that helped whoever asked that. Well, they just said thank you. I do so love them. You're welcome. Thank you. And you're most welcome. Shannon, we're back September 1st, and we'll have an interview with a CDO yet to be determined. Shannon, we'll hand it back over to you for a wrap-up. Thank you, John and Kelly, for another great presentation as always. Just a reminder, I will be sending out a follow-up email within two business days, by end of day Monday for this webinar, with links to the slides and the recording of the session, and also we'll be selecting some winners to attend Enterprise Dataversity, September 19th through 22nd in Chicago, where you can meet John and Kelly in person. I'll also be there as well. I love seeing all the familiar people on the webinar. Thanks to everyone for participating and for all the great Q&A here. So, that's about it. I hope everyone has a great day. John and Kelly, again, thank you so much for your time, and we'll see you all next in September. Thank you all. Thank you.