From the campus of MIT in Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium. Now here are your hosts, Stu Miniman and Paul Gillin.

Welcome back, this is theCUBE at the MIT CDOIQ Symposium. I'm Paul Gillin, here with my colleague Stu Miniman, and we've been talking a lot about the role of the CDO and where that job fits within the organization, particularly vis-à-vis the CIO. Is the CDO a threat to the CIO? Does the CDO eventually become the successor to the CIO, or do the two jobs exist as peers? We're going to hear right now from a CDO and a CIO at the same organization. With us are Daniel Lebeau, the CIO at GlaxoSmithKline, the big pharmaceutical company, and Mark Ramsey, the CDO at GlaxoSmithKline, a newly created position that's only been there for a year, so it's still being defined. They presented a session today on dynamics between the CDO and CIO, so let's talk about those dynamics. Let's start with really both of you. When this job was being defined, Daniel, you were already in position. What was the goal, and how was this job defined relative to what your responsibilities are?

So everything starts from a vision from the head of R&D, Patrick Vallance, who strongly believed that the way we do R&D in pharmaceuticals will change over time, and that analytics and insight from data will be key in the coming years. So the first experiment was to ask an external company, a very well-known big data company in Silicon Valley, to do a series of pilots for us, where they in fact extracted data from domains that we were never putting together. We are used to extracting data from clinical, from regulatory, from discovery. We were not used in the past to extracting mixed data from commercial, safety, clinical, and regulatory, including external resources like scientific literature.
So the first experiments were about mixing these different domains, and some of these experiments were successful. Then we decided that instead of having all these analytics outside the walls of GSK, it would be much more important to have that as a foundation, as a basis of our discovery. Then came the decision to have this internal, cross-domain big data analytics in R&D. And to accelerate the speed, to be able to do that project in, let's say, one year or one and a half years instead of five years, it was clear that we had to recruit somebody who has the expertise, plus the knowledge, plus the trust, plus the contacts, and that's why Mark came.

Why was this not a role for the CIO to take on?

In some sense, the way I define my role as CIO is about getting the maximum from process and data. And I was explaining this morning that process and data are two sides of the moon, and if you look too much at the process and forget data, it is a mistake. So it's part of the role, but in this case, point one, data is not the ownership of the IT department. Data is the ownership of the business. Point two, it's extremely important to link that initiative with a kind of top R&D initiative. So in some sense, analytics may come back as a role of IT, perhaps in two, three years. I don't know, I don't care too much. I mean, the question does not come in these terms for GSK. The question was, how can we together achieve a huge objective at the highest speed?

Mark, what appealed to you about this opportunity to really come in and define a whole new role in a very large company?

Yeah, I think Daniel touched on it. I mean, one of the things that I look for was really the sponsorship and the executive support. We talk a lot about that, where it's extremely important to have senior management support in an initiative like this, because these are painful and sometimes costly projects and it really does require strong support.
And so in this particular case, as Daniel mentioned, Patrick Vallance, who's the president of R&D, is a very strong supporter. The senior executives are very strong supporters, and that to me was quite appealing. And it was also interesting because I've done a lot of work in many different industries. This is a very unique industry because you're actually helping patients. You're changing people's lives. A lot of the work that has been done as it relates to big data, and many of the CDOs, are in banks and in retail companies and in telecommunications, which are great organizations, but it's about sales and marketing. It's around selling another product, keeping customer loyalty. This was very fascinating because it was an opportunity to help transform the R&D organization. And within a pharmaceutical company, R&D is a very large part of the organization. It's not a little piece that just happens to come up with a new idea. It's fundamental to the operation of a pharmaceutical company. So it was quite appealing to take the knowledge that I've built up over the last 25 years and apply it to a new industry so that I actually get to learn along the way.

So one of the big challenges of data these days, and for quite a long time, is security. Where does that fall between the purviews of the two of you?

Well, it's both, actually. I mean, GSK has a chief information security officer who's part of Daniel's team. And security, we talked a little bit about that earlier today, where one of our security mechanisms, unfortunately, is data fragmentation. So we have a lot of data that's fragmented, and that can actually act as a barrier for folks to get to all of the information. With the creation of a standardized platform, making it easier to get to data, it's something that increases the importance of security.
So we've been very focused on that. I said earlier today that the CISO is now my best friend, because it's extremely important to protect the information and have the appropriate levels of security because we're eliminating that fragmentation.

Yes, so security and data, did these become board-level discussions at GSK?

So security is board-level by the nature of the board, and it becomes more and more so. On the data, I cannot say it's a board-level discussion. It's an executive team discussion. It's not necessarily a board-level discussion. I think everything will depend on two aspects: how fast we can demonstrate the value of the experiments we do. That's very, very important, because then it will become a board discussion. And also, don't forget that extraction of insight from data is, of course, quite important in R&D, but we also have areas in the commercial part of the pharmaceutical company, mainly the consumer health care organization, where, of course, we try to get the maximum insight from our data. And to be clear, the discussion of analytics in consumer health care has been a board discussion.

But I think it's also that the data itself is not the key here, right? It's the value that you can derive out of the data. So I think the board-level discussion is around key strategic initiatives of R&D, and what the leadership team has recognized is that the way to achieve those objectives is through using data and driving more strategic uses of the data. So it's not that the board says, well, we need these attributes. I mean, it's really around trying to achieve those objectives that are set out.

Yeah, Mark, I'm curious. What kind of feedback loops do you have in the data to kind of improve processes and create new products, things like that?
Yeah, I mean, I think step number one, what we've been working on is really, quite frankly, that GSK has grown up through mergers and acquisitions, which creates a fragmented data environment that adds to the complexity of actually using data as a strategic asset. So one of the key focuses now is making that data more available across R&D for these strategic purposes. And as part of that, you learn a lot about the data. You learn about data quality issues. You learn ways to bring data together and simplify some of the applications. And all of that is fed back to different parts of the organization. But as Daniel mentioned earlier, the data is owned by the business, which means they have the responsibility for actually addressing data quality issues. And so that's something that we are very keen on, having the business own the responsibility for the quality of the data. It's not an IT problem. It's not a CDO problem. It's something that the business really needs to own.

That's a theme we've been hearing a lot today. Well, now on a day-to-day basis, I would assume that the two of you talk a lot. But is that really the case? I mean, do you meet every day or are the functions more separate than that?

No, we don't meet every day. I mean, there is interaction between Mark's organization and the IT experts for the extraction of data and for some aspects of infrastructure, which is done on a daily basis, but not directly with you; it could be with your direct reports. We meet in the different steering committees, together, by the way, with the R&D organization, and it's mainly to fix the priorities in terms of what we are going to do first. So what are the different sub-projects in this project?
Which data are we going to feed first, or something like that. Also to fix the ambition and to address, like any steering committee, security. At the end of the day, the steering committee is about finding the right balance between speed and quality and between security and transparency.

One of the things that we're doing is we've really established the R&D Data Center of Excellence as a startup company. You mentioned earlier about GSK being a large, complex organization, which is exactly true. We've really established this program to behave like a startup company within a major corporation. And so in many cases, that means we identify areas where things operate quite well, and they're very structured and process-oriented, that work well in some regards and may not work so well for what we're trying to accomplish from an execution perspective. So I think sometimes I talk to Daniel more than he probably wants, where we're trying to move the problem up the chain faster than what we might do with a typical program.

So at this event for the last few years, one of the questions that people have had is, is the CDO where innovation lives in the company, and therefore is that a threat to the CIO? How do you see that dynamic working?

Well, I mean, again, what we're doing is driving innovation around using data as a strategic asset. And on that piece of it, we're working in partnership with IT to make that happen. And there are many other innovations that have nothing to do with data that live in various parts of the organization. So I think it's a little unfair to say that innovation lives with the CDO and the CIO gets all the old stuff. I mean, that's not how it works. I think that shows well in articles, but in reality it has to be a partnership. It has to be, because we're building something real, right? We're not doing a little trial off to the side.
We want to build this into the infrastructure that's in place within GSK, which means we need to work together. That doesn't mean that there aren't little tensions here and there, but we have to work through those and actually move this thing forward.

So give us some examples of what you've been doing. I realize you've only been at this a year, but what kinds of success are you seeing? What projects have shown promise right now?

Yeah, and what we've done, I mean, in the data space you can spend a lot of time, right? So the risk is always that you invest one year, two years, three years trying to just clean up the data and make it available, and you really don't have much to show for that. So we definitely did not want to take that approach. And so the approach that we've used is we've identified use cases. These are specific, actionable items where we can bring certain information together and make it available that hasn't been available in the past, and it allows scientists to make decisions on that data that they can't make today. And we had a portfolio of almost 100 use cases that we boiled down to the top 10, and we used those 10 to sequence the work. We wanted to have a balance of short term, like three months, and longer term, maybe six months. And those are the actions that we're taking, and it's a combination of bringing external data together with internal GSK data. A lot of that effort is around getting the platform in place. So we now have a platform in place that allows us to house this information and make it available to the scientists. And that took a period of time. And now we're loading data and making that available to the scientists, and we're doing that with these use cases, which each have defined value. And so it's kind of the heavy lifting work right now. It's bringing fragmented data together, rationalizing that information, making it available to the scientists so that they can make better decisions.
And as we go, the data volumes will get larger and the problems will get more complicated, but we will keep building on those successes so that we stay away from the build-it-and-they-will-come type of approach. And we also stay away from taking too long before we deliver value.

I wish we could talk more. There are so many dimensions to this, but we are out of time. Thank you again very much, Mark Ramsey, CDO, and Daniel Lebeau, CIO of GlaxoSmithKline. Real treat to have you on theCUBE.

Thanks for having us.

This is theCUBE. We'll be back in a moment.