From the campus of MIT in Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium. Now, here are your hosts, Stu Miniman and Paul Gillin.
Paul Gillin and Stu Miniman back joining you from the MIT Chief Data Officer and Information Quality Symposium in Cambridge. We are here for two full days of streaming interviews. This is theCUBE, the SiliconANGLE live video platform, and we're here talking about Chief Data Officers. We hear a lot of talk about Chief Data Officers at this conference and we've got one right here in front of us, Douglas, excuse me, Dan Morgan, who is the Chief Data Officer of the U.S. Department of Transportation. Now that's got to be an interesting job or an interesting agency. Dan, how did you become a CDO?
I think my journey was a little bit different. I think everybody comes to the job a little bit differently. We don't train CDOs quite yet. I started out as a management consultant doing supply chain work, and so logistics kind of taught me the importance of data and analytics and truth in our data for me to be able to do my job and help my customers save money and serve their customers most efficiently. I moved into IT systems management and architectures, and when the opportunity presented itself to compete for the job of Chief Data Officer at the U.S. Department of Transportation, I threw my hat in the ring and I was fortunate enough to be selected. I'm the first Chief Data Officer at the U.S. Department of Transportation and the first Chief Data Officer at a cabinet agency, which is kind of a unique scope. So it's an opportunity to really shape what this job could be about in the federal government.
We've been talking a lot about that earlier on. We had two of the founders of the Institute of Chief Data Officers talking about the sort of fuzzy description of what CDOs do. It seems to vary by company, by industry. What is your job?
I break it down into three simple terms. Improve our data governance, which is really about getting the right people around the table to make decisions about what priorities we need to focus on to make our data more useful and support the agency. I can talk about what kind of data we have in a little bit. Then engagement with users. So President Obama has really made open data a priority for the federal government. It's not like we haven't been releasing data in the past, but it's become increasingly important because data is so fundamental to accomplishing a lot of our agency missions. And so engagement with our data users also helps us understand where we need to focus our priorities in terms of where we need to make technology improvements and quality improvements. The last piece is around technology enablement and bringing modern 21st century tools into the agency so that we can get the most value from our data for the staff that we currently have.
Dan, I'm wondering, talk a little bit about kind of the openness of data and security. I mean, you run federal transportation, something that's pretty critical, want to make sure it's protected. So how do you balance that? How do you approach that?
It's a team sport, I guess, is the first thing. It's not a tension either. One of the things that the open data policy really did for us was move the presumption to one of openness. You start with, this is probably releasable, and work back for the reasons why it's not. And that might change what you release. So in the statistics community, we've got statistical disclosure limitation and other kinds of advanced privacy-protecting techniques that still allow us to get statistics out. That's how the census gets data for us to be able to know where people live and how many people live in individual geographic areas, which is kind of important for when we wanna build transportation in the first place.
And then there's this notion of working with the chief information officer and the chief information security officer to make sure that our systems have the right controls in place, both from an access management and then from a fundamental technological controls perspective, to secure the data.
The U.S. Department of Transportation, I mean there are a lot of DOTs around at the state level and even the city level. Where does your responsibility begin and what are the key kinds of data that you gather on an ongoing basis?
Sure, we're a mishmash of an agency. So we've got the Federal Aviation Administration. So if you've ever tracked a plane for when your loved ones are going somewhere, that's our data, and you can track it on any flight tracker, whatever it is. FlightAware, FlightStats, you name it. That's our data being reused by lots of different folks. We also do all of the charts and runways that the pilots use to navigate the airspace system. We have a lot of regulatory compliance data. So when you drive down the road and you see inspectors pulling trucks over and doing inspections, that's some federal workers and a lot of state workers. And all of those data flow to the department for us to understand which trucking companies might be more risky than others so that we can conduct further inspections at their headquarters. We have all of the data about vehicle recalls, which was a hot topic in the news. Child safety seats as well and other vehicle equipment are also important. So we have a consumer protection mission. We have a public health mission. So we collect the census of fatal crashes. That's data that begins with a law enforcement officer responding on the street and moves up through the layers of government, the city, the county, the state, and ultimately to the federal government.
And we use that data to understand which kinds of vehicle technologies would help reduce the number of fatal crashes or which kinds of roadway improvements might be most effective in preventing those crashes from happening in the first place. We've got data, you may have heard the President in the State of the Union talk about how many bridges were eligible for Medicare. That's our data. We have an annual health report on all 700,000 bridges in the United States that comes to us. We have information on pavement condition that comes to us annually. There's a lot of sort of slower-moving data. We get it monthly or quarterly or annually. But then there's the real-time operations stuff that the FAA does. And for folks who don't know, the St. Lawrence Seaway as well. Just throw that in for good measure.
Yes. Thanks. Gosh, talk about transportation. One of those things, budgets have to be challenging. Is data a tool to help you save money? Does the government ever think about monetizing the data that it has there? How does that fit into it?
Generally, we're trying to make the data open and freely available. Some state and local agencies have thought about monetizing the data. In transit, for instance, you could get transit directions from Google Maps and Apple and all those other kinds of mapping companies. And transit agencies were thinking for a minute, maybe we could take that schedule data and make it something that we could make money off of. But they realized that the ability to attract a customer and give them some certainty about when the bus was gonna arrive would give them a better return than trying to resell that data. And so farebox recovery was more important than trying to make money off of that data. And I think that's sort of the key: understanding there's these ancillary applications that can help communities more effectively advocate for where transportation investments need to occur.
They can highlight where transportation may or may not be working for them. Industries can better understand where transportation issues might occur so they can site their locations more effectively and create jobs in those communities and understand how they'll be able to serve their customers by getting goods to market. I've even heard of people mining the roadside safety inspections to more effectively market their windshield repair services. So they were looking for windshield repair violations and figuring out where they should put their billboards, which is kind of neat. And of course, we want windshields to be repaired because that's safe. So these are all interesting, sort of novel ways to put the data to work. And I think the economic value of those kinds of activities outweighs some sort of cost recovery.
I suppose the taxpayers have already paid for the data in the first place.
Exactly.
It strikes me that the internet of things must be a huge opportunity for you. And as you start bringing in, I'm sure you're already bringing in a great deal of data from sensors, what are you doing about standardizing and cleansing that data, making sure the quality is good?
This is one of those where it's like the network of governments, from the state all the way up to the federal government and even down to the city, sort of working together. We just conducted what we called the Smart City Challenge, and folks can read about it at www.transportation.gov/smartcity. Columbus, Ohio was selected as the winner, but we outlined 12 vision elements for what a smart city or an internet of things enabled smart city might be. And Columbus thought of some really interesting applications. Some of it was around making sure that folks in an underserved community could make it to their healthcare appointments. They have an infant mortality problem.
And they thought that technology might be a way to ensure that mothers could get to their doctor's appointments, make sure their children were healthy by getting transportation to serve them more effectively. And if they missed an appointment, finding ways to follow up and rebook automatically with the healthcare provider so that they could lower the burden of getting people back to the doctor. I think that's the kind of real application that matters. And from those applications, then standards come, right? Use cases matter. It's not just about having a sensor talk. And sometimes sensors get sick, or sometimes, because we're the Department of Transportation, birds build nests on sensors. So we've got these sensors at the side of the road that measure temperature and rainfall, for instance. But if a bird builds a nest, the data gets ugly, right? And it becomes something that you can't interpret very easily. And that kind of stuff you're never really gonna overcome, because it's a physical phenomenon. You can control things inside an app. But when you're out in the real world, you have to be able to adapt to dirty data. We work very closely with the states and with industry to try to set these standards. There's a lot of committees inside the department that work together to make that happen.
Dan, this morning in the keynote, David Portnoy talked about the open data mandate, and he said one of the challenges is it's unfunded and there's not a ton of direction. You're the first CDO at the cabinet level. How are you addressing this? How do you see kind of more CDOs funding? What is the mandate that you get?
So I am responsible for making sure the open data mandate happens for us. To your point earlier, we've already paid for a lot of technology to get the data in and to turn it into something. We may need to make smaller investments to make that data more useful. But governance is not a new mandate. Governance has been around for a while.
What's happened is that we haven't been doing it as effectively as we possibly can. And with a lot of organizational structure, the activities tend to defer to the non-experts inside agencies. So they tend to fall to a program manager or to a lawyer sometimes. And what that really means is we just need to work on our training, our education, and getting some unity of effort. So for me, data leadership is really around taking the folks that want to do this work and getting them organized so that they can be most effective, and giving them the tools and the skills that they need to make their data useful. The open data policy has a sort of second title, and nobody reads the second title: "Managing Information as an Asset." And I have a lot of theories about managing information as an asset, but when you talk about other assets like technology or people, those things are supposed to be productive. They're not just something that you take care of and maintain in a good condition. You do it for a reason. And that reason is to allow people to get the most out of that data set and be able to blend it with other data. And that's what it's about for me.
You collect a huge amount of data. How do you manage it? Where do you store it and what do you use to manage it? And is that changing now with the arrival of technologies like Hadoop and cloud?
I think we're at the beginning of that change inside the department. We've got one of every legacy technology you could possibly name. There's a lot of technical debt in government, but there's a lot of technical debt in other industries, too. Finance and insurance all have these same kinds of problems. We're not unique. People talk about us more, I think, is really what it comes down to. And really, I think managing that technology transition first begins with giving people the ability to do scrappy prototyping.
So we've put together this innovation sandbox where we've pre-funded a small amount of cloud services, and people can form a hypothesis about what technologies they want to test and what they'll be able to do with them, spin up those resources for about four to six weeks, do their experiment, and then document their lessons learned. We make that as widely available as possible, and we have a lot of forums from which we can share the lessons learned so that more people can continue to try or to not try things that didn't work. And I think just getting us in that experimentation mode is key to making the cloud and Hadoop transition. There's other administrative things. My chief information security officer gets nervous every time I say cloud, but we'll figure it out.
Even though the DoD is making a huge move to the cloud.
Yeah, all across government, we really are. And we've got the tools and techniques to help us do that. But we just need to build our own processes to be a little bit more repeatable and less of, "What is that again?" It's not always "no," it's like, "huh?" But we'll figure it out.
Well, best of luck carving out your role, defining a role that other CDOs can emulate. We're out of time. Dan Morgan, thanks for joining us. And this is theCUBE. We'll be right back shortly with the rest of our coverage from today's MIT Chief Data Officer and Information Quality Symposium.