From Cambridge, Massachusetts, it's theCUBE, covering MIT Chief Data Officer and Information Quality Symposium 2019, brought to you by SiliconANGLE Media. Welcome back to Cambridge, everybody. We're here at the Tang Building at MIT for the MIT CDOIQ Conference. This is the 13th annual MIT CDOIQ. It started as an information quality conference and grew through the big data era. The Chief Data Officer emerged, and now it's sort of a combination of those roles: the governance role and the Chief Data Officer role, critical to organizations for quality and data initiatives, leading digital transformations and the like. I'm Dave Vellante with my co-host Paul Gillin. You're watching theCUBE, the leader in tech coverage, and Mark Krzysko is here as the Deputy, sorry, Principal Deputy Director for Enterprise Information at the Department of Defense. Good to see you again. Thanks for coming on. Thank you for having me. So, Principal Deputy Director, Enterprise Information, what do you do? I do data. I do acquisition data. I'm the person in charge of aligning the acquisition data for the programs for the Undersecretary and the components. So, a strong partnership with the Army, Navy and Air Force to enable the department and the services to execute their programs better and more efficiently, and to be efficient in the data management. What is acquisition data? So, acquisition data can generally best be considered, in shorthand, cost, schedule, and performance data. When a program is born, you have to manage it, you have to be sure it's resourced, you're reporting on it to Congress, you need to be sure you have insight into the programs, and finally, sometimes you have to make decisions on those programs. So, cost-schedule-performance is a good shorthand for it. So, kind of the key metrics and performance metrics around those initiatives. And how much of that is how you present the data, the visualization of that? 
Is that part of your role, or is that sort of another part of the organization you partner with? Well, if you think about it, the visualization could take many forms beyond that. So, a good part of the role is finding the authoritative, trusted source of that data and making sure it's accurate, so we don't spend time disagreeing on different data sets on cost, schedule, and performance. The major programs are tremendously complex and large and involve an awful lot of data in a buildup to a point where you can look at it. It's not just about visualizing; it's about having governed, authoritative data that is frankly trustworthy, that you can go operate on. What are some of the challenges of getting good quality data? Well, I think part of the challenge was having a common lexicon across the department and the services. And as I said, the partnership with the services has been key in helping define and create a semantic data model for the department that we can use, so we can have agreement on what the data means when we're using it and collecting it. The services have gone all in and, from their perspective, have extended that data model down through their components to their programs so they can better manage the programs, because the programs are executed at a service level, not at an OSD level. Can you make that real? I mean, as an example, can you give us what you mean by a common semantic model? So for cost and schedule, let's take a very simple one: program identification. Having a key number for that, having a long name, a short name, and just a general description of it: those were in various states amongst the systems. We've had decades where whoever had the system configured it the way they wanted to, and it was largely not governed, and then trying to bring those datasets together was just impossible to do. 
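[Editor's note: the governed-key idea described here, a shared program identifier plus agreed names, can be sketched in a few lines of code. This is purely illustrative; all class names, field names, and values are hypothetical, not actual DoD data structures.]

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProgramRecord:
    """One program as described by a single source system (hypothetical)."""
    program_id: str   # the governed "key number" shared across systems
    long_name: str    # agreed long name
    short_name: str   # agreed short name
    description: str  # general description

def reconcile(records):
    """Group records from different systems by the shared program_id.

    With a governed key, records that once disagreed on naming can be
    joined; without it, merging the datasets is unreliable.
    """
    by_id = {}
    for rec in records:
        by_id.setdefault(rec.program_id, []).append(rec)
    return by_id

# Two systems describing the same program under the common key:
a = ProgramRecord("P-1001", "Joint Example Program", "JEP", "Service A's copy")
b = ProgramRecord("P-1001", "Joint Example Program", "JEP", "Service B's copy")
merged = reconcile([a, b])
```

The point of the sketch is that agreement lives in the shared `program_id` and naming conventions, not in any one system's local configuration.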
So even with just program identification, and since the majority of the programs and numbers are executed at a service level, we worked really hard to get common words and meanings across all the programs. So it's a governance exercise. Yes, it is certainly a governance exercise. I think about it not so much as what the IT world or the data world would call governance; it's leadership. Let's settle on some common semantics here that we can all live with, and go forward and do that. Because clearly there are needs for other pieces of data that we may or may not have, but establishing a core set of common meanings across the department has proven very valuable. What are some of the key data challenges that the DOD faces, and how does your role help address them? Well, in our case, and I'm certain there's a myriad of data challenges across the department, in our place it was clarity and governance. Many of the pieces of data were required by statute, law, policy, or regulation. We came out of an era where data was a piece of a report and not really considered data, and we had to lead our way beyond the report to saying, no, we're really talking about key data management. So we've been at this for a few years, and working with the services, that has been a challenge. And I think we're at the point where we've established the common semantics for the department to go forward with. One of the challenges, I think, is access and dissemination: knowing what you can share and when you can share it. Because, as Michael Conlon said earlier, data in mosaic is sometimes something you really need to worry about from our perspective. Is too much publicly available, or should we protect it on behalf of the government? That's a challenge. And is there a challenge in terms of, I'm sure there is, but I wonder if you could describe it, or maybe talk about how you might have solved it. 
Or maybe it's not a big deal, but you've got to serve the mission of the organization. Absolutely. That's number one. But at the same time, you've got stakeholders, and they're powerful politicians, and they have needs, and there are transparency requirements. There are laws. Those two directives are not always aligned, are they? Well, thank goodness I don't have to deal with misalignments of those. We try to speak the truth of: here's the data. Across the organization, our reports still go to Congress. They go to Congress on an annual basis through the Selected Acquisition Report. And we are getting better at understanding what we need to protect and how to advise Congress on what should be protected and why. I would not say that's an easy proposition. The demands for those data come from the GAO, come from Congress, come from the Inspector General. Navigating that requires good access and dissemination controls, and knowing why. We've sponsored some research through the RAND organization to help us look at and understand why you have to protect it and what the policies, rules, and regulations are. And all those reports have been made public, so we could be sure that people would understand what it is. We're coming out of an era where data was not considered as it is today, where reports were easily stamped with a little rubber stamp; but data now moves at the velocity of milliseconds, not at the velocity of reports. So we really took a comprehensive look at how you manage data in a world where it is data and it is on infrastructures like data models. So, the future of war. Everybody talks about cyber as the future of war, and there's a lot of data associated with that. How does that change what you guys do? Does it? Well, I think from an acquisition perspective, in that discussion you just presented us, we're micro in that. We're equipping and acquiring through acquisitions. 
What we've done is make sure that our data is shareable. You know, open API structures. Having our data models, letting the war fighters have our data so they could better understand where information is, letting other communities better help with that. By doing our jobs where we sit, we can contribute to their missions, and we've always been very sharing in that. Is technology evolving to the point, let's assume you could dial back 10 or 15 years and you had the nirvana of data quality, we know how fast technology is changing, but is it changing as an enabler to really leverage that quality of data in ways that you might not have even envisioned 10 or 15 years ago? Well, I think technology is. I think a lot of this is not in tools; it's now in technique and management practices. I think many of us find ourselves rethinking how to do this now that you have the data and now that you have the tools to get at it. How can you adopt better and faster? That requires a cultural change in the organization. In some cases it requires more advanced skills. In other cases it requires you to think differently about the problems. I always like to consider that we at some point thought about it as a process-driven organization: step one to step two to step three. Now, process is ubiquitous because data becomes ubiquitous, and you can refactor your processes and decisions much more efficiently and effectively. What are some of the information quality problems you have to wrestle with? Well, in our case, by setting a definite semantic meaning, we kick the quality problems to those who provide the authoritative data. If they had a quality problem, we said, here's your data, we're going to now use it. So it changes the model: ensuring the quality falls to those who own the data. 
And by working with the services, they've worked down through their data issues and have used us a bit as the foil for cleaning up the data errors that they have from different inputs. I like to think about it as flipping the model: it's not my job to drive quality, it's my job to drive clarity; it's their job to drive the quality into the system. Let's talk about this event. So you guys are longtime contributors to the event. Mark, have you been here since the beginning, or close to it? About halfway through, I think. When the focus was primarily on information quality. So was it CDOIQ at the time, or was it IQ? It was the very beginnings of CDOIQ; it was right before it became CDOIQ. In the early part of this decade? Yes, yes. Okay. It was the information quality symposium. Yes. Originally, is that what attracted you to it? Well, yes, I was interested in it because I think there were two things that drew my interest. One, a colleague had told me about it, and we were just starting on the data journey at that point. And it was talking about information quality, and it was out of a business school, the MIT Sloan side of the house. Coming from a business perspective, it was not just the province of IT. I wanted to learn from others, because I sit on the business side of the equation, not as a pure IT-ist or a technologist. And I came here to learn. And I've never stopped learning through my entire journey here. What have you learned this week? Well, there's an awful lot I've learned. This space is evolving so rapidly with the law, policy, and regulation establishing the CDOs and establishing the roles. Getting to hear from the CDOs, getting to hear their visions, to hear from Michael Conlon and from others in the federal agencies, having them up here and being able to collaborate and talk to them. Also hearing from the technology people, the people that are bringing solutions to the table. 
And then, I always say this is a bit like group therapy here, because many of us have similar problems. We have different start and end points, and learning from each other has proven to be very valuable. From the hallway conversations, to hearing somebody and seeing how they thought about a product, to seeing how commercial industry has implemented data management. And you have a lot of similarity of focus: people dealing with trying to bring data, to bring value, to their organizations, and understanding their transformations. It's proven invaluable. What statement did the appointment of the DOD's first CDO last year make to the organization? That that is important; data are important. And when Michael came on board, we shared some lessons learned, and we were thinking about how to do that. As I said, I function in arguably a silo of the institution, in the acquisition data, but we were copying CDO homework. So it helped, in my mind, that we could go across to somebody else who would understand what we're trying to do and help us. And the CDO community has always been very sharing and collaborative, and I hold that true with Michael today. It's kind of the ethos of this event. And then, I mean, obviously you guys have been heavily involved, and we've always been thrilled to cover this. I think we started in 2013, and we've seen it grow. It's kind of fire-marshal full now; we've got to get to a new facility next year, I understand. So congratulations on all the success. Yeah, I think it is important, and we've now seen, you hear it, you can read it in every newspaper and every channel out there, that data are important. And what's more important than the factor of governance and the factor of bringing safety and security to the nation? 
I do feel like a lot, certainly in the commercial world, I don't know if it applies in the government, but a lot of these AI projects are moving really fast. And especially in Silicon Valley, there's this move-fast-and-break-things mentality. And I think that's part of why you're seeing some of these big tech companies struggle: because they're moving fast and they're breaking things without this sort of governance injected. And many CDOs are not heavily involved in some of these skunkworks projects. It's almost like they're bolting on governance, which has never been a great formula for success in areas like governance and compliance and security. You know, the philosophy of designing it in has tangible benefits. I wonder if you could comment on that. Yeah, I can talk about it as we think about it in our space, and it may be limited. AI is a bit high on the hype curve, as you would imagine right now. And the question would be: can it solve a problem that you have? Well, you just can't buy a piece of software or a methodology and have it solve a problem if you don't know what problem you're trying to solve; you wouldn't understand the answer when it gave it to you. And I think we have to raise our data intellectualism across the organization to better work with these products, because they certainly represent utility. But it's not like you can use it with no fences on either side, or open up your aperture and find easy solutions to this. To move forward with it, your workforce has got to be in tune with that. You have to understand some of the data, at least the basics. And particularly with products, when you get to machine learning, AI, deep learning, the models are going to be moving so fast that you have to intellectually understand them, because you'll never be able to go all the way back and stubby-pencil back to an answer. 
And if you don't have the skills and the math and the understanding of how these things are put together, they may not bring the value that they can bring to us. Mark, thanks very much for coming on theCUBE. Thank you very much. It's been great to see you again, and we appreciate all the work you guys do for the community. All right, and thank you for watching. We'll be right back with our next guest right after this short break. You're watching theCUBE from MIT CDOIQ.