Okay, we're back. This is Dave Vellante. I'm with Wikibon.org. I'm with my co-host today, Paul Gillan. We're here at the MIT Information Quality Symposium. We're live all day. We broadcast yesterday; it's a two-day event here. theCUBE was invited in. We were thrilled because we cover a lot of big data topics and not enough is said about information quality and data quality and data governance. And there are a lot of adjacencies as well that we're learning about at this event. Peter Aiken is here. He's the founding director of Data Blueprint and very active with the Virginia Commonwealth University. And welcome to theCUBE. Good to see you. Absolutely. Glad to be here. So you've got a little prop here. I noticed it's the case for the chief data officer. Paul and I were talking about the chief data officer role before this. And we heard from Derek Strauss yesterday that most companies don't have a chief data officer. Should they? Yes. And the real key to this is that we've got sort of an interesting situation that we've allowed to happen over time. And Paul, you've witnessed an awful lot of this in your career. But when we think of what a chief officer is in an organization, your chief financial officer is focused on one thing and one thing only. And that's making sure that the fiscal assets of your organization are available to implement the strategy of the company. They don't do bookkeeping. Your chief medical officer is focused on making sure that medical practices in the organization are of the highest standard of care and, again, supporting the goals of the organization. They don't do surgery. Your chief risk officer is focused on risk aspects of what's going on in the organization. And they don't do software testing. Now what do we ask our CIOs to do? If you'll excuse me, I'll do this, right? The portfolio of what we ask our CIOs to do is enormous. And they are supremely talented and have gotten phenomenally good at what they do. 
We're talking email, we're talking backup and recovery. Now we're adding big data topics into this as well. And they're quite frankly juggling. And in order for them to do more with data, something else is going to have to give. And as most of us know, in IT, we can't afford to let anything else go lax because that's hard enough as it is. So the long answer to your question is yes, most organizations need a single individual to focus 100% on data all of the time. So the corollary of that premise is, and we've talked about this earlier, if you're going to be a data-driven organization, somebody's got to have responsibility for that data strategy. But Paul brought up an excellent point. I want to take a contrary position here, just for the sake of argument. One of the problems with creating C-level positions is that it means that other people don't have to worry about that job. So I've got a CFO, which means I don't need to worry about finance. We hear a lot of talk now about the chief customer officer, which I think is a harebrained idea because everybody should be focused on the needs of the customer, not just one person. Isn't that the same case with data? I mean, we have a chief information officer. Data is a subset of information. Shouldn't that role be part of the CIO office? So it's a great point, and it is one that we have to examine very carefully. In most organizations, though, when you look at what's actually happening there, there's a couple of deficits that occur within that existing structure of being a CIO. First of all, IT organizations are project-driven, as they should be. IT projects need to have a beginning, middle, and end. One of the things you'll notice about data is that it doesn't have a beginning, middle, and end. It's not a project; it persists. So trying to manage data with this type of a project orientation doesn't work. It hasn't worked so far, and it isn't going to work. 
Now, Dave, you mentioned earlier, I've got sort of an interesting background in that I'm a tenured university professor. The university owns part of my consulting business. I'm also president of DAMA International. Yeah, let's talk about that a little bit more. I should have given you an opportunity up front, but you just jumped right into it. We do that sometimes on theCUBE. So yeah, so why are you qualified to talk about this topic? Y'all gave me the camera, right? No, seriously, talk about your background because it's very interesting. So DAMA International is the professional organization for data managers. And what we are trying to do is elevate the status of the profession. And one of the things I talk about in the book is a fair amount of research. But you can't hand a book like this to a C-level person and say, here, read some research that we've done. So this is actually the short letter that I had time to write, whereas most of the time it's the apology: I didn't have time to write you a short letter, so I've written you a long letter instead. And our research shows that back 20 years ago, 25 years ago, the data people reported directly into the CIO. Now what we've seen is that in the intervening two decades, we've pushed that function down in the organization and we've done it largely through the fault of college and university systems. For example, when you look at what people take as they're going through college and university, they get one course in data. And that course happens to exist in the IT department, where they talk about building new databases. Now frankly, if there's a skill we don't need on the planet, it's building new databases. So it's pretty easy. And when we've done that, when smart people, you know, leaders, take these classes, they're going, okay, so that's a construction function and it really belongs in the bowels of the organization. 
But data, when you look back at it from an abstract perspective, is our sole non-depletable, non-degrading, durable strategic asset. Now the durable part immediately causes people to pay attention, and it circles back to your point earlier, Paul. As a durable asset, we manage those things differently than we manage our normal assets, right? For your paper, your supplies, the things that last less than a year, you have a different management process than for things that last longer than a year. So being a durable asset, it does actually warrant additional time and attention. One of the stories I like to tell when I'm illustrating this is the Enron story, which many people remember. As Enron was devolving towards the end of its life, it got married to a company called Dynegy. And as part of that marriage, they came along and brought several billion dollars in cash to keep Enron afloat. They were a little cash short. Enron as a company had no cash controls anywhere in the organization. Any individual in the organization could write a check for any amount of money for any reason, right? Your eyebrows are going up and you're going, that is not good governance, right? I want to work at a company like that. Fine and handy, right? For the six months that it lasts. And now that we're 10, 12 years beyond this, most people are going, yeah, that was an example of bad corporate governance. Well, if we're going to govern our data assets with the same care and precision and professionalism that we do our fiscal assets, the risk aspects of the organization, and the medical practices in the organization, just to use those three examples, then we need a class of professionals who know how to do this. Now you'll also hear another piece of this too. And that is that you'll hear the term business rules. You know why it's a business rule? Because we're over on the IT side and we have to go over to the business people and ask them questions. 
So another point I make in the book is that this C-level position should report to the business as opposed to IT. Okay, well that's interesting. That plays into my next question. Because I hear a lot of frustration at this conference with the inability to connect with the business side, to get the business people interested in data quality as a core discipline. Is the problem really that IT is too invested with responsibility for data quality, and so it becomes something that others just don't have to worry about? It's one of these things, right? You know, you always hear that data quality is everybody's responsibility. That means it's everybody else's responsibility. My data's good. Or I'm getting by. But the minute someone says, Peter, your next raise is going to be dependent on taking a specific aspect of data and improving its quality in a measurable fashion, I'm gonna start studying that piece. I'm gonna look at it very carefully and say what's going on. But we also have this belief, again, that data is being taken care of somewhere because we have this title, Chief Information Officer. Must be okay, right? Chief Financial Officer, I don't have to worry about that anymore. So what we've got to really recognize is that most CIOs are doing the best they can with the very large palette that they have. And again, they're doing a phenomenal job in terms of what they're able to juggle. But if we ask them to do more with data, and that's what big data is really all about, something's gonna have to give, and most organizations don't have that flexibility. So let's move that function over to the business, let's make it report directly to the board at the same level the other asset managers report into, whatever that is in the organization. And let's pay some attention to it. Well, and you talked, Peter, about how universities sort of perpetuated the problem with the way in which they train students. 
But as well, organizations were fine saying, okay, the data governance piece, we'll push that down because the data governance is in this God box here and it's all about the financial systems and the like. And that's a pretty narrow part of the corporation's overall data. And then all of a sudden, there's this big data theme, and everybody rolls their eyes at it, but it's actually real in the sense that lines of business are saying, hey, I can actually spend money and hire data scientists and put stuff up on Amazon and go do stuff, and it's gonna create more data than exists in that God box. Now all of a sudden, the CEO says, whose data do I believe? And now that problem of a single version of the truth just gets multiplied by 100 or more. Absolutely. Okay, so how does that play into the case for the chief data officer? Will the chief data officer be that sort of centralized function or will it sort of exist in some kind of virtual environment? Good question. If we think about what our chief officers do, they're focused on one thing and one thing only. I'm arguing for an overcorrection because of inattention in the past. It's entirely possible, for example, that this chief data officer function might become something like our chief electrification officer. Now, you think about this for a while and say, okay, we did have chief electrification officers, 1890 to 1910, and in some parts of the world, there are still electrification officers. Do you think a knee-jerk reaction is appropriate at this time? I think overcorrection is the term I'd like to use, but let's just put some serious attention on it. In terms of making this real, we hear talk about risk, about defining this as a risk problem which the business can understand, or as an opportunity, I should say: an opportunity to sell more to the customer, to customize the experience, or to build the business. Which message do you see resonating better? 
Unfortunately, the message that's resonating better is the innovation component. Now, I'll go back to Michael Porter's strategy pieces. And again, the fun part about what I get to do is I do have a lot of measurements in these areas. So I can tell you guys that only one in 10 organizations even has a board-approved data strategy. Now, there's obviously a strategy, again, around medical risk. There's obviously a strategy around financial. There's obviously a strategy around HR, right? If our workforce averages 55, we've got to go out and hire some young people, right? So these are things that they do. If they don't have a strategy, then it doesn't matter what we're doing from the other part of it because any road will get you there. Well, that's an interesting stat. I would think most organizations say, well, what is a data strategy? I mean, really, why do we need one? Well, I would assume a lot of those organizations are in regulated industries where they don't really have any choice. Right. Is that fair? Fair. At the same time, though, if we think about strategy from a basics perspective, Michael Porter, again, crystallized this for a lot of us, and there's two dimensions of strategy. You look this way, it's innovation. You look this way, it's looking at efficiency and effectiveness. But these are orthogonal, and most organizations are trying to do both at the same time. So what we're advocating is a crawl-walk-run strategy of first practicing what you're doing. And if I take it one step further, most everybody remembers Maslow's hierarchy of needs, that wonderful pyramid. Well, data's kind of like that, in that we put in the top part of that pyramid MDM, mining, all the big data concepts, whatever the latest silver bullet of the week is. And again, over the 30 years we've been in this business, we've seen a lot of silver bullets come and go, right? That's great. 
But if you don't have a good foundation on which to build those things, you can get there, but it'll take you longer, cost more, and deliver less at greater risk to the organization than if you instead adopt this crawl-walk-run approach, which is first of all getting good at what you're doing in data management, and then trying to go to the really innovative pieces. So I always advocate to organizations, look to your expense reduction area. Even though it sounds hard in today's environment, it's not a bad place to go. If you can turn around and say, I can save you 10 million on the bottom line just for the next three months, people pay attention to those numbers very quickly. Go ahead. We're seeing a lot of interest in cloud computing today as a way of moving apps, spinning up apps more quickly and making the business more nimble. And when you move to a cloud service, you're either migrating a database or you're building a new one. Is there an opportunity during that process to fix the quality problem? Ooh, that's a huge opportunity. And again, we see most organizations aren't taking advantage of it. So data in the cloud has three characteristics that it should possess that data outside the cloud does not possess. The first one is that it should be of higher quality. That's easy to justify by saying, why would you put data of lower quality in the cloud? But if you don't know, you need to measure it as it's going in. The second part is that it should be less in volume. And again, I'll give you another stat, Dave, that you might like. 80% of organizational data falls into the category of ROT: it's redundant, obsolete, or trivial. Craplications, we call them. If it is, why would you put any of that stuff out in the cloud? It should be less in volume. If it's not less in volume, again, just because the cloud is cheaper, you're just putting it out there. Now you're making it harder to get at because you sort of have to use those glove boxes to get in it. 
And by the way, you've got a little pipe and you're pushing a lot of data through it as well. So let's be careful about what we do. So, Paul, the articulation there is that it is an opportunity for organizations to do it. The third characteristic is that data in the cloud must be by definition more shareable than data outside the cloud. If it's not more shareable, why are you putting it in there at all? Well, is this a cloud service? I mean, is this something that cloud providers can offer as a service? We're going to help you clean up your data as well as provide the application. It can be, but where are they going to make more money? To give the application. Well, they're going to sell you more cloud service, which is what they're, again, think about it. We don't do many things well at the same time. Yeah, lower quality, high volume is better for the cloud service guys, is what you're saying. Because that's how the billing model is. Bingo, bingo. Now, again, it's not that they have mal-incentives. They're not bad guys. But they sell this service. This is what they're trying to do. So it's, of course, in their interest to do what they do well. They're Pavlov's dog. Okay, so you said one in 10 organizations has a board-approved data strategy. Correct. Does the percentage of organizations that have a CDO actually track with that, or is it even less? Less. We just finished a survey at Data Blueprint on this, a new section of it. It's at our booth up there, and we'll have it online soon, yeah. Thanks, Megan. Get a hold of it. What we're seeing, again, is that 70% of the CDOs were hired within the last 12 months. And of those, we see some very interesting challenges in that half of them have no budget, and half of them have no staff. Which in organizations means it's a short-term job, yeah. Okay, so that's good. This is a great segment; we could go on. 
We're out of time, but I wanted to talk very briefly about the Data Blueprint model. So that is a consultancy that is a joint venture, if you will, with the university. They and I co-own the organization. Awesome, I love that. So how did that come about? Well, universities, and we're here at MIT, right, are commercializing the intellectual property that we develop with the faculty and trying to push this out into the community so that universities are seen as not just great centers of research and learning, but also useful to the immediate community. So it does put me in a very nice position of being able to work with it. I'll mention this. Our new dean at the School of Business at Virginia Commonwealth University is the former CEO of Disneyland. So that was an interesting move for us. That's great. All right, Peter Aiken, thanks very much for coming on theCUBE. It was a pleasure. Thank you. Great stuff. Awesome data. Really appreciate it. Paul Gillan and I, we'll be right back with our next guest. This is theCUBE. Live from the MIT Information Quality Symposium.