Okay, we're back. This is Dave Vellante and Jeff Kelly. We're with Wikibon. This is theCUBE, SiliconANGLE's production. We're here at the MIT Information Quality Symposium. This is day one for us. We'll be here all day tomorrow, doing wall-to-wall coverage. Those in our audience who follow theCUBE, you know how we go to these events. We bring the best guests that are at the event to extract the signal from the noise. And Jeff, the interesting thing about this event, we have been covering big data for several years now. And the interesting thing here is we've seen a couple of themes emerge in big data in the last couple of months that we hadn't really been hearing. One is security, we had Ely Kahn on from Sqrrl talking about that. And obviously that had an information quality angle. But the other is this notion of data quality, information quality. Now we've certainly heard for years about the single version of the truth in the data warehousing business and the like. But now with the big data meme, I think there's a heightened sensitivity to data quality. And it seems to me, Jeff, my observation is there's a time and a place for data quality. You know, what we've been hearing today is, look, if it's Twitter data and you're trying to look at sentiment and you get a false positive, okay, that's not an issue. But we're hearing from healthcare practitioners, financial services practitioners, even sectors of the government, we've heard from the VA. Major challenges in terms of disparate systems, siloed systems that they're trying to bring together and get to communicate with each other. And another consistent theme that we're hearing is it's always people, process, and technology, and the technology is the easy part. It's the people and the process. Yeah, that sounds good, but when you really start to dig into how these practitioners have resolved these problems, it really is a lot of heavy lifting and a lot of hard work. So what were your takeaways from today?
Well, a few things. So one, on the point about data quality in a big data context, you're right, it's less about having a perfect data temple with all your data at 100% high quality and it's more about the right level of data quality for the right application. So that's one thing I took away from today. Another was that the role of chief data officer is one that's still kind of evolving and emerging, but I think one thing I really took away from today was that the focus really isn't just the data, it's about the business and it's about listening. There's one skill I took away, or one job that I heard from each of the chief data officers we talked to today that they really focus on, and that is listening to the business. When they join their organizations, Derek Strauss, TD Ameritrade, we just spoke to, the first thing he did for the first 100 plus days was go out to the business and listen to their problems, their challenges, so that he could understand where to focus his efforts. So it's really about listening, communicating with the business as much as it is about implementing technology to actually apply data quality to your data assets. So those are two of the big things I took away and I think it's all about, again, having the right data in the right place at the right time and less about having a pristine data temple. That's really one of the keys that I took away today. Well, and then back in the day, the vision was always, okay, I have this data temple, I'm going to go out and buy the biggest UNIX box I can find, I'm going to shove a bunch of data in there, it's all going to be standard and I'm going to let certain people access this, I'm going to set up processes, I'm going to build processes around that data temple and I'm going to build a single version of the truth. Well, that sounded good but it didn't actually work, because you ended up with what the VA had. What Dat Tran described today is very similar to a lot of organizations.
We've got hundreds of systems, many, many databases, they're all siloed, there are processes built up around each of those and they don't talk to each other. There's not a 360 degree view of the customer, we're not sharing customer data, there's a lot of data entry going on and of course that means a lot of potential for error. I don't think the big data initiative solves that problem at all, in fact, if anything it makes it worse. What I see happening here, in my observation, is that you're seeing, as I said before, a lot of heavy lifting, and practitioners are really trying to start understanding this problem, they're getting to the root cause, they're putting together data architectures and I think they're dealing with this big data as an opportunity, potentially to use analytics to resolve some of the data inconsistencies but also as a way to drive business value in ways that perhaps they couldn't with traditional methods. And so I think you're seeing that emerge and we talked today about the old, we've had this discussion before, Jeff, the tail wagging the dog and the balance of value and I think that there will be a natural equilibrium reached between this new emerging, unstructured, new VOData as we heard today and the traditional data. So you're not going to be here tomorrow, actually Paul Gillin is going to be doing some of the co-hosting with us so really excited to have Paul on. Paul is really the reason why we're here and somebody who has collaborated with us in the past and has seen the evolution of the Wikibon and SiliconANGLE community, so we're going to miss you tomorrow but... Well, I wish I could be here. So what are the things that we should be looking for tomorrow that you want us to pay attention to? Well, I think continue exploring this role of the Chief Data Officer, I'm fascinated by how this is evolving, and actually in practical ways inside the organization. Where does the Chief Data Officer sit in the organization?
How does he or she effectively communicate with the rest of the business and disarm some of those parts of the organization that are a little wary of this new role? Some other things I think, I always love talking to practitioners who are focused on different verticals and understanding really the implications, the real world implications, not the theoretical that we talk about sometimes but the real world implications of bringing in new types of data, the pressures on Chief Data Officers and other data professionals in financial services, in healthcare, in government, to really deliver analytics to end users who increasingly, in this kind of consumerization of IT world, want analytics and data at their fingertips and want it now. So there's this increasing pressure on data professionals to deliver the solutions that their end users are looking for and to do it quickly. So I'm always interested in how people are tackling that problem and then of course, as we mentioned earlier today, these are some industries that are also under heavy regulatory compliance pressures. So how that impacts what Chief Data Officers, data scientists, other data professionals are able to do and how they navigate those waters, because there are some serious consequences if you run afoul of compliance regulations. Do you agree with Derek? I mean, certainly he wants to see more Chief Data Officers, that was a good interview. Do you agree that there will be? Do you think this is a natural progression? I mean, what he described at TD Ameritrade, to me, was unique in that he cherry-picked really high quality people from these organizations and formed this new organization under a CDO. In many organizations that would be like internecine battles to actually have that occur. And it seemed to be fairly frictionless at TD Ameritrade. Now maybe that's because there was a business case behind it, you know, the COO was behind it, but what's your take on that?
Well, I think part of it is exactly that executive sponsorship, but, you know, I think we're going to increasingly see this because I think it was Derek who said, you know, in years past, the focus was on systems and apps and data didn't get as much consideration, and that's clearly changing. So, you know, data spans silos, right? It can apply to any number of parts of your organization. And there are a number of unique requirements and considerations when you're talking about data in the enterprise that aren't necessarily directly related to or coupled with the applications and systems that are really running the day-to-day business process. So really, I think you're going to increasingly see chief data officers, or people with maybe not exactly that title, taking on some of those responsibilities of treating data as an asset inside the organization and finding the best ways to deliver data and turn it into value for the various business units to ultimately, you know, drive profits and, in other cases, to save money and become more efficient. You know, it's going to happen in some more data-intensive industries sooner than in other industries where data is not, at least now, considered critical. But, you know, as we move forward, as we've talked about on theCUBE and in our research, there really isn't a vertical market that's not going to be impacted by data in the next five, 10, 20 years. I mean, even something like agriculture, which you might not think of as a data-intensive industry, is going to be impacted. So, I mean, you name an industry, we haven't come across a vertical market that's not going to be impacted to some degree. So increasingly, absolutely, I think you're going to see, if not, you know, CDO by name, there are going to be increasingly roles within various enterprises whose job is going to be to manage and really make better use of data.
Well, we hear so much about data-driven organizations, and there's a spectrum, as you well know. Many organizations are saying, ah, we're very metrics-driven, we're data-driven, but when you really peel the onion, they're not so much, you know, data-driven. Many organizations, as you know, Jeff, are, you know, historical pattern-driven, historical process-driven, or, you know, very customer-driven, for example. And so, that's not necessarily a bad thing. So, it's nice to say, okay, we're a data-driven organization, but actually becoming a data-centric organization is not an easy transition to make, and maybe not always the right transition to make, but for many industries, as you point out, it will be. The question I have for you is, is a natural progression of a data-driven organization a data-quality-driven organization? What's your take on that? Well, I think, when you understand that, if you wanna use data to drive your business, to find new lines of business, to find new lines of profit, revenue, et cetera, the only way you're gonna do that is with good data. I mean, you can analyze data that's of questionable quality all you want, but it's not gonna give you insights, at least not for long, that are gonna really allow you to improve your business in the long term. So, it really wouldn't make sense to try to make the transition to a data-driven organization if you're not gonna focus on having good quality data. It's just, it's the foundation, really, of any data-driven organization. I don't know, I don't know if I agree with that, and here's why, I think it depends on what industry you're in. So, as an example, I mean, I think if you're in financial services, healthcare, the ones we talked about today, no doubt, but you look at the web scale guys, they're going after value, I mean, it's quality second, I mean, look at Google, right?
I mean, Google, over time, has looked at data quality as a process, and it's certainly improving its data quality, but it's made a lot of money with imperfect data. Ad serving has been, it's been terribly imperfect for the last decade, and it's all starting to get better. We've had guys like Aerospike on, and Tapad, talking about how they're h-streaming, how they're improving that whole system, but there's been a lot of money that has exchanged hands, a lot of value created with imperfect data. Now, eventually it's a natural evolution that that's going to get better and better and better as a natural cycle, but I'm not so sure that there's not a lot of low-hanging fruit for imperfect data. Well, right, but what I'm not suggesting is that you stop your business and wait until you've got perfect data before you start driving data initiatives. It's got to occur simultaneously, and it's got to be more of an iterative approach. Well, we heard today from Dat Tran that data quality is not a project, and I couldn't agree more with that. So that's to your point. Right, so constantly improving your data quality is not at odds with the big data approach of the web-scale companies, for example. I mean, certainly there's plenty of low-hanging fruit, but as you just said, even the Googles of the world, the Facebooks of the world are working to improve the quality of their data as they evolve their operations. So what I'm not saying is, hold the presses, stop your business until you've got perfect data. That's not the way to go. It's much more of an iterative approach, but if the quality of the data is not a core aspect of what you're thinking about as you're moving forward, you could quickly find yourself in a position where you've invested a lot of money and time in people and technologies to manipulate and analyze data, but to no avail when the underlying data, the raw material, just isn't accurate.
So they've got to evolve kind of side by side, but you simply can't ignore data quality, and of course it does depend on the use case as well. If you're serving ads online, that's one use case and there's a certain threshold for data quality you want to reach. If you're prescribing medications to critically ill patients, there's a different threshold. So how you balance those will vary by industry, by work, by use case, but even in the large web scale companies, I think data quality has still got to be part of their overall initiatives. All right, Jeff Kelly, well listen, I really appreciate you hanging with us today and all the preparation for this event. I really appreciate the invite from the folks at MIT and the collaboration with Paul Gillin, who will be co-hosting with me tomorrow, and I look forward to that. So keep it right here tomorrow. We start at 10:30 Eastern time and we will open it up, have our first guest on around 10:40 and go most of the day. I think we end mid-afternoon tomorrow. You can tweet me, I'm @dvellante. You can tweet @theCUBE, our new Twitter handle. So appreciate all the feedback and the tweets and the support. The hashtag for this event is #MITIQ. Go to siliconangle.com, check out all the blogs associated with the videos here today. Go to youtube.com/siliconangle and we'll have a playlist up shortly. Most of these videos will be up by this evening, and also go to wikibon.org, check out all the research around data, data quality, big data, check out wikibon.org/blog and also look for what we call the shock and awe page, where we take all the blogs that have been written about this event, all the videos, and we aggregate them into a page. It'll be called something to the effect of MIT information quality. It may already be up. I checked earlier, it wasn't up yet, but it will be up by tomorrow. So we'll let you know what that is.
Really appreciate you watching, your tweeting, and we'll see you tomorrow, everybody. This is Dave Vellante with Jeff Kelly. Have a good night.