Live from the campus of MIT in Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium. Now here are your hosts, Stu Miniman and George Gilbert.
Welcome back to theCUBE. We're here at the MIT Chief Data Officer and Information Quality Symposium here in 2016. Happy to have on the program a Chief Data Officer. Go figure. Mike Kelly, who's the CDO of the University of South Carolina, which Mike has informed me is the original USC. So thank you so much for joining us.
Go Gamecocks.
Excellent. So, the CDO role: you've been there about two and a half years. Can you walk us through your background? What led to this? Was there a specific mandate or role or funding within the university that led to the position?
Sure. At USC, we've had some notions of cooperative data governance and data management since about the mid-90s, or at least dating back that far. And we had a committee that was impaneled to look at data management practices and try to figure out where units could be most helpful in cooperating with each other. But the folks who were on that committee, as they came up with great ideas, kept running into roadblocks, which were that they had day jobs already. And so, trying to get those ideas implemented, more often than not they suffered for lack of attention, because data governance, data management wasn't somebody's job. And so that group approached our CIO, probably around 2013, and asked for someone to head up that committee as a position and try to bring about the changes that they wanted to see.
Okay, and can you explain to us just organizationally where the CDO sits compared to the CIO, and maybe that balance of power?
There's no comparison. I report to the CIO. He is the one who was approached by the committee. The committee was advisory to him according to university policy.
And so when they approached him, he was the one who had to make the decision: do I move forward with this? And he's the one who was paying attention to literature and what was going on in business and industry, saw the title of chief data officer and said, that sounds a little bit like what we're being asked to do. This is a little bit more strategic, a little bit more visionary. Let's kind of float that title out there with the data management group and see if this is a direction that we want to go in. And he got their buy-in.
Yeah, so Mike, I'm sure you've heard there's a lot of discussion. It's like, oh, you know, should the CDO be in a separate group because there could be some conflicts there? You know, who runs innovation? You know, where does security fit in the whole piece? How's that been? How's the relationship? And what feedback would you have for the community as to kind of that relationship?
Absolutely. I don't think there is a right place. We've had the conversation since I was about two months into this position as to the goodness of fit of the CDO reporting to the CIO. And it's an open conversation. Do I expect it will be that way forever? Absolutely not. Do I think it will be that way for a while? I don't know. Organizations go through change, and universities are probably notorious for restructuring themselves. And I do expect that at some point in the future, someone will say, there's a more logical home. I guess the best justification, other than the fact that our CIO had the vision for the position, is that among all of our different lines of business, all the functions that we perform, IT probably generates the least amount of data. If you look at what a registrar, an admissions office, or a human resources function produces, what academics produce with teaching and learning and research and service to the community, IT is more often than not charged with enabling that through the provision of technology services.
Data is seen, rightly or wrongly, as a form of technology. And since we aren't the mass producers of data on campus, we're the custodians who receive it. It's probably a safe assumption to say that in some ways we have the least dog in the fight, and maybe some objectivity about how to give governance direction and guidance to others.
So, Mike, you describe something that's actually really interesting in that you have responsibility over all the data but not a lot of authority. And I'm listening to that. And I'm also guessing that because we're in a distributed computing era, there's a whole lot of proliferation of data sets. And I don't mean different data sets. I mean data that might be for one science experiment used in another, or registrar data used in one place and another. So at the very, very basic level, how do you establish one source of the truth?
Okay, I've kind of walked back away from the single source of truth. There are certain places where that's applicable, but what we want more often than not are explicable forms of truth. If you have your own version of a grade point average, let's say, and I think most people can relate to that as an example, then explain why your grade point average within your college is different from the overall university GPA. If you're within a major or a minor and you're looking at a GPA, explain how that is computed differently, in a way that folks who are receiving the information will intuitively understand and go, okay, I see why this kid's got a 3.25 GPA on this report but they've got a 3.6 GPA over here. It's a matter of what courses we happen to be looking at.
So that sounds almost like you're gonna log all activity that transforms data or touches data, so it's not that you have to have a single source of truth, but you have to have a log that accompanies all the changes to data.
Ideally, we're moving in that direction. Do I think we'll ever check that box and say it's done?
No, the nature of data and information changes too quickly. But what we do wanna do is make sure that folks are collecting or maintaining the data that they really legitimately need to do their job, and that they do so in a way that they understand what data it is that they collect. If you don't need home address, don't collect home address, because there's perhaps very limited use for that if the student's living on campus or if they have moved in town. Collect only the data you want, and be responsible for the things that you need on a daily basis.
So Mike, how do openness and security fit into your job function? We talked to a lot of the federal people; they've got the open mandate. And I would think from a university standpoint there must be certain things that you wanna share with lots of other universities. So how do those two fit?
There is a gap there, right? Faculty who do research, if they're funded by a federal agency and that federal agency for its grant awards has an open data mandate, then what we call the principal investigator, who receives and administers the grant, is responsible for making sure that they comply with that federal mandate for the particular grant or award that they're running. We would consider them a data steward for that particular grant and the data that it's collecting and maintaining. On our administrative side, we do not have what I would equate to open data policies. We absolutely have state reporting requirements, federal reporting requirements. We have bond ratings by Moody's and all these kinds of things. U.S. News & World Report ranks colleges and universities. Those are places where we are asked to open our data, but more often than not, it's factoids about the university: what's our tuition, what's our head count, what's our faculty-to-student ratio. And so we're very open and transparent about those things, but they're usually calculated or generated at an aggregate type level.
Maybe broken down or pivoted out by major or by college or school within the university.
Okay, and security, anything specific on security?
Absolutely, security and data governance go hand in hand with us. I was mentioning to a panel earlier today that one touchstone for both of us is the data classification schema. All data elements, all data assets need, by state of South Carolina law, to have a data classification applied to them, and we put that responsibility on the stewards of those data elements or those data assets. Once they determine the classification of an element or an asset, then from there flow the privacy practices and the information security practices, as well as some of the custodial type responsibilities we have within the division of IT for maintaining that data.
So I wanna ask something about what may be the bleeding edge of sort of proliferation of data, where science on the research side, either through simulation or through experimentation, generates like orders and orders of magnitude more data than ever before. And we used to just publish something in a research journal, and then you could reproduce the experiment or you couldn't, and it therefore didn't pass the smell test. How does that work now, where you might have hundreds of gigabytes or terabytes or petabytes of data that you wanna enable other scientists to reproduce?
Let me be cautious here, because like all organizations we have structures. We have not tried to dictate or control the data governance of our research initiatives. That's an open topic of conversation for us. It's a path we need to go down. What I can say is that researchers have to look at the mandates. If you get an award from NSF or NSA and you're collecting data, and there's a mandate to open that data from your research to other researchers elsewhere rather than just publish your own article off of it, they are responsible for making sure they meet those expectations.
There are resources to support them. We have a high performance computing initiative at the university. There are also federal and regional high performance computing resources available, and it's up to the researcher to work with those resources to figure out which one is the most appropriate.
So Michael, I'm curious, in your role where you report up to the CIO, how much are you concerned with, or do you touch at all, kind of the infrastructure or things like public cloud? Is that involved in your day-to-day life?
So one of the interesting distinguishing characteristics of higher education is that, and I'm not talking about data governance here, the organizational behavior and culture itself is one of shared governance. The faculty exert pressure on how the university's controlled. The president, the board of trustees, the students, and to some extent the staff and administrators all put pressures on how a university operates. And so we have what we call distributed IT. We have relatively weak central control of IT. Our business lines determine what they need in order to be successful with either being an admissions office or a registrar or a bursar or any of these other functions within the university. They'll go out and talk with vendors at conferences and come back and say, you know, we need a better information system that will help us count students who do X, Y and Z, or keep track of these types of programs and services. IT in our model more often than not is about enabling those needs. We do have ERP systems for human resources, administration, finance, and our master student records. But a lot of the functions across the university are really very specialized in their programs and services, and therefore in the data assets that support those areas.
Mike, last question I have for you: it's your first time coming to this event. I know you know some of the people from the community, but what's the value in you coming up to this symposium? What's your experience been so far?
Could you share with our audience a little bit of the flavor of the experience here?
So I think the Chief Data Officer title is sort of new on the planet, but particularly new in education, which is my industry. Everybody feels special and abused in their own unique role or organization. When you come here, you end up finding out that you really aren't that unique and different, and that there's a lot of empathy, a lot of similarity in the experience, and therefore a lot that you can learn from other folks who are tackling or have already tackled some of the same issues.
All right, well, Mike Kelly, really appreciate you coming to share the story of what's happening at the University of South Carolina and with your CDO peers here at the event. We'll be back with lots more coverage from the MIT CDO IQ 2016 Symposium. You're watching theCUBE.