Hi, we're back. This is Dave Vellante with Paul Gillin. This is theCUBE, SiliconANGLE's continuous production of the MIT Information Quality Symposium. Tony D'Onofrio is here, the Chief Technology Officer and Senior Vice President of Technology at Truven Health Analytics. Tony, welcome to theCUBE.

Thank you.

So start by telling us a little bit about Truven, and then we'll get into your role. We're going to geek out a little bit here. We haven't in the last two days. That's rare for us on theCUBE, but tell us about Truven.

Well, at Truven Health Analytics, we deliver healthcare analytic databases and healthcare reference data to every stakeholder in the healthcare community. We deliver solutions for consumers of healthcare so that they can manage their selections of healthcare plans and manage episodes of healthcare. We deliver applications for providers of healthcare, at the clinical point of care, that provide critical information for making decisions on caregiving. And we provide applications to payers of healthcare, both government, Medicare and Medicaid, as well as commercial health plans and large employers who want to understand how to best deliver efficient quality care to consumers of healthcare.

And talk a little bit about your role as a CTO. Is it a visionary role, strategy role? Get involved in actual implementation, R&D decisions, all of the above?

Yeah, I'd say all of the above. We're what might be called a mid-sized organization, and we're at about 2,100 Truvenites, and we have to manage the development of our software and our tools and our analytic methods, and we have to deliver those. And that's a lot of work because we provide many products across the spectrum of product lines to those communities.
But we also have to get ahead of the curve, and that's an important part of my job: to look at what opportunities technology affords the healthcare community, and how we can extend and evolve our services so that we can offer, or enable, I should say, all of the healthcare stakeholders the ability to provide higher quality care to more people in a more efficient manner.

Okay, so you're an independent software vendor, essentially, in the healthcare business. You've got, sounds like, quite a portfolio of applications.

We do.

Consumers, docs, and payers.

Absolutely.

So are these largely bespoke applications, or sort of grouped in suites by each of those constituents?

Yeah, they're grouped in suites. We have product lines around what we call care management, which would be things like our CareDiscovery product, which is for the providers of care to look at the clinical quality of their care, and to improve that over time, using collections of their own data with benchmarks that we apply to it, and analytics that we apply to it, to give them key information to make those decisions on improving care. We also provide, in the area of care management, operational applications like ActionOI, which allows the folks in provider organizations who are managing the efficiency of the overall operation as an enterprise to use financial data and other operational data to understand their own improvement, using benches and comparing to each other, and so on and so forth. And then on the clinical side and the point of care side, we offer solutions like Micromedex, which has a suite of important referential information, such as toxicology and drug interaction information and things of that nature, that, when used at the point of care, allows clinicians to make better, well-informed, evidence-based decisions on giving care.
And finally, for payers, we offer what we often consider risk management solutions, like Advantage Suite, that provide analytically ready data over populations regarding the cost of their care and the efficiency of their plans to provide that care.

One of the things about analytics that I think is interesting is that companies and organizations often find unexpected benefits. So they go in thinking that they're going to increase revenue and, in fact, they find that their biggest pay-off is in operational efficiencies. Are there any stories that you can tell of your customers where they have seen perhaps unexpected windfalls from applying analytics to their business?

Yes, we have many cases where clients have accrued unexpected benefits from our analytics. I won't name any client cases yet. However, in the area where we do risk management, we have many cases where our surveillance and payment integrity solutions have delivered information that has allowed payer organizations to recover well into the tens of millions of dollars per period of reimbursement.

So the benchmarking piece is interesting. How did that come about? I mean, obviously you have the chicken and egg problem there, right? You had to have enough data to be able to even offer those benchmarks. There's also the confidentiality and privacy issues. Can you talk about that a little bit, how you came up with that product and have successfully delivered it?

Yeah, that's actually a great question because depth and breadth of data is really important when you're providing analytics. And we're fortunate because, having been around a while as an organization, we've been able to pool large collections of data in such a way that allows us to look at individual consumers, individual patients, if you will, longitudinally, over a long period of time.
And to your point about privacy, the way that we're able to do that is by pooling those data and de-identifying those data. So the important attributes about the episodes of care for those communities and all of the information that we have is de-identified and is pooled in a way that allows these benches to be created without providing individual, you know, personally identifiable information.

So, but nonetheless, even though you're anonymizing it, did you have to sort of work with the providers of that information and say, okay, we're gonna do this? Or did you just kind of do it and ask forgiveness?

Yeah, that's a good point. Data rights management is a critical capability for us. We actually call it supply chain management, the way a manufacturing organization would. Those arrangements can be complex. There are variations on the theme, but long story short, with large portions of the communities that we serve, we're able to make value arrangements whereby they allow us to utilize these data for benching and for aggregation and pooling. And that benefits the whole community, including the direct participants in those arrangements, who get the benefits of those broader benches.

So what will I see? I'll see myself in context to a larger pool, and then there's presumably some granularity of that pool, some categorization, is that right?

Yeah, what you would see depends on your use, which application you're using, what you're looking for, and what your role is. If you're an individual consumer using one of our consumer products, for example, you could look at the list of actual episodes, claims, if you will, that you've incurred over the period that you've participated or your employer has participated in providing this application.
Or you could see information that aggregates that and gives you some suggestions and analytical information about how you might approach that condition. So you would see both in that case. If you were the plan provider, you would not see granular data about the individual user, but you would see summary information about your population and summary cost and care information about that population.

What about the payer? That's sort of the most interesting to me. I wanna know, how do I stack up against my competition, essentially? So okay, you're not gonna do a head-to-head for me.

Right.

But what can you tell me?

Yeah, payer-to-payer comparisons are less rich, you know, at this point.

Sorry.

Yeah, it's difficult. You know, we do have a view on it because we're providing services to each of the payers.

You have a white paper on your website calling for price transparency.

Right, and we believe that that's critical. It's fair to say that certain stakeholders in the process, especially on the payer side, currently feel they're disincented to provide price transparency. However, if you look at it in the mid to long term, I think the pressure will be there to do that in order to enrich their plans and attract the best mix of consumers.

Gas stations have to do it, but the insurance companies don't.

Yes, that's right.

I wanted to ask you about this. Farzad Mostashari was on from Health and Human Services earlier, and he was talking about his vision of a world in which every interaction between doctor and patient becomes part of a bigger database that contributes to the body of knowledge. It furthers our ability to make smarter medical decisions.
Now, part of that depends upon companies like yours being forthcoming, I guess, with that information, anonymized, aggregated, however. Do you see in the future that you will be sharing the kind of information that you collect with other companies like yours in the pursuit of the greater good?

Yes, I do. I think that as the industry develops and as capabilities and technology and data interchange move forward, that will be more the case than not. I think in the very long term, what we'll see is what one of my colleagues, Donald George, and I were talking about earlier, and I like his term: information as a service. And the key to providing information as a service will be managing that privacy in a public way, managing that level of access so that only aggregated or anonymized data can be accessed in a public setting, but data can be privately drilled down on if the right access makes sense to grant.

That will require a massive standardization initiative, because companies like yours and insurance companies will need to actually have a consistent set of data standards to permit this interchange. Is that right?

Yeah, and the interesting thing about that is that the standards do exist today. There are very mature and continually improving standards for data encoding and data semantics. So on the admin data side, or the claims data side, if you will, we have ICD-9, going to ICD-10. On the clinical side, we have HL7 in various releases. The real challenge there, and what's going to evolve rapidly, I believe, given the need to become more efficient, is the quality standards in those data and the implementation standards. So that, for example, certain attributes absolutely must be present in all cases and won't be allowed to be fudged by putting certain data in a text field because, within the confines of your system, that works for you in that context.
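The point above about required attributes that cannot be "fudged" into a text field can be illustrated with a minimal sketch. This is not Truven's method; the field names (`patient_key`, `diagnosis_code`, `service_date`, `notes`) and the sample values are hypothetical, chosen only to show the idea of checking that coded fields are actually populated.

```python
from typing import Dict, List

# Hypothetical required attributes; the interview does not name specific fields.
REQUIRED_FIELDS = ("patient_key", "diagnosis_code", "service_date")


def missing_required(record: Dict[str, str]) -> List[str]:
    """Return the required fields that are absent or blank in a record.

    Free-text fields such as "notes" are deliberately ignored: a value
    typed into a text field does not satisfy a required coded attribute.
    """
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(record.get(field) or "").strip()
    ]


# Illustrative record: the diagnosis was typed into the notes field instead
# of the coded field, so the check still reports diagnosis_code as missing.
fudged = {
    "patient_key": "p1",
    "diagnosis_code": "",
    "service_date": "2013-01-15",
    "notes": "dx 250.00",
}
print(missing_required(fudged))
```

A check like this only enforces presence. Whether the code that is present is valid ICD-9 or ICD-10 would be a separate validation step.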
Easier to standardize the fields than what people put in the fields, right?

Exactly, it's a good way of putting it.

Now, you deliver your software as a service. Is it on-prem, both? How does that work?

It's both, but mostly and increasingly, it's software as a service. We offer all the products that I mentioned earlier and services around those, where we provide analytic consulting and the data management for the individual participants. We provide that as a service and we host those. That's actually very efficient for our customers and for us. It's easier for us to manage the security, and it's easier for us to manage privacy and protocols around the data, if we manage it and host it, as opposed to doing it within the context and the confines of a customer's or a customer partner's infrastructure. But we do that as well.

So I wonder if you could talk about the database behind it. I said we haven't really talked tech, but when you look at the delivery of software as a service, three companies come to mind. You got Salesforce, who just did a monster deal with Oracle as the back end. You got Workday, who I believe does a homegrown glio. And ServiceNow is another really interesting one. They got a single CMDB running on MySQL. I think they're going to Cassandra. But anyway, regardless, talk about your database architecture.

Yeah, I'm glad you asked that because we're a little bit interesting that way. We are not software as a service in the same way that you would see Salesforce and ServiceNow. In some ways we are, in some ways we're not, I should say. So the way to think about how we manage those data is in a couple of ways. One is we instantiate individual data warehouses or data marts, given these are business intelligence applications where you're doing analytic reporting over analytically ready data. We instantiate individual marts for each customer's scope of data.
And that's typical for the risk management type products like Advantage Suite and Count Group Reporting, where there really is a need from the customer perspective to have their drill-down data and their data collection isolated. We do that in a shared-tenant way over shared infrastructure, or in dedicated infrastructure, depending upon the customer's policy. But we instantiate in the same way; there would just be another instance. As opposed to that, if you look at Salesforce, within a large instance of the infrastructure they have multi-tenancy on an application layer within the database. So they have table sets that are...

Which also of course hammers them, but maybe they're going to a new model now.

Right, so we have segregation.

And Workday is probably similar, I don't know for sure.

I think they are, I'm not sure either. And so we use Salesforce and we use ServiceNow, by the way.

And ServiceNow is awesome as it is.

Great tool, great tool. And we feel our tools are great too. But that's the way we instantiate databases. We use commercial platform components like Oracle Exadata in some instances, as well as IBM's Cognos. And we have integrating software and optimization software and custom reporting software around that. In addition, we have custom tools to do the data acquisition, management, and delivery to those analytically ready data models.

And how about all this unstructured data? Are you starting to dabble in Hadoop? Are you moving in that direction aggressively?

We're moving in that direction aggressively, but we believe firmly that, as with all new storage and retrieval paradigms or data management paradigms, it will coexist with relational database management and with other database management approaches for quite some time to come.

Kind of like PCs coexisted with mainframes, or a little more equilibrium?
No, I think kind of like relational databases coexisted with file system access and flat file data. You know, it never really replaced that. It did replace hierarchical databases, inverted list databases, things like that, but inappropriately so in a lot of ways, because it became a paradigm lock. It probably shouldn't have for a lot of applications. It became the only way to do things. But Hadoop, in that way, won't replace relational database storage. Where we're applying Hadoop, to be specific about that, and this is critical for the development of healthcare quality and cost going forward, is where we're advancing our capabilities in integrating clinical data with claims and admin data to provide richer analytics and more near-real-time analytics. We're employing Hadoop there because the nature of the clinical data is much less long-cycle adjudicated data. It's very lumpy, very volatile streams of data in terms of when they come in. And we basically feel that the distributed file system approach that Hadoop gives us gives us the ability to create piles of those data in very efficient ways, and pull piles of those data and apply analytics to them in a much more efficient way.

And what database are you using there?

We're using the Hortonworks distribution.

Apache, so Hortonworks distribution and HBase?

Yes, and our view on Hadoop and distributed files is that we feel it should be lean. With the Cloudera and Oracle Big Data Appliance approaches, until we see otherwise, or we get to that scale and we understand exactly where and how we're gonna apply it and at what scale, we're gonna stay lean with our Hadoop distribution and roll our own using Hortonworks. We do see a future where it very probably would coexist, where we would deploy the Cloudera or Oracle Big Data Appliance where it made sense.
But right now, a lean and mean distribution that allows us to figure out where we're gonna take it is the best thing.

So using the Apache distribution, which is open source and free, are you actually engaging with Hortonworks on a subscription basis at this point?

Yeah, what we're doing with Hortonworks...

You're a customer of Hortonworks?

We're a customer of Hortonworks, and the good thing about that is it takes off the tape. When you're rolling your own and dealing with the open source distribution and so on and so forth, and you're maturing your organization's operational and software development capability in using the tools, it really helps to have a partner who gives you a hand to hold, who's been there, both for development issues, like how do you optimize your implementation, as well as bite and chew issues, like what's the best way to deploy and extend your deployment on your physical platform. It's great to have that hand to hold. It's gonna do that.

Yeah, those guys know what they're doing. Hortonworks, for those of you who don't know, spun out from Yahoo, with guys like Arun Murthy who are serious committers to Apache Hadoop, and what they're doing with YARN is very interesting. All right, so we're really getting into it now and these guys are giving me the high sign. So all right, Tony, thanks very much for coming on theCUBE. Really a pleasure meeting you and sharing your perspectives. I really appreciate it.

Thank you, it's been a pleasure to be here.

Keep it right there, everybody. Paul Gillin and I will be right back after this. This is theCUBE. We're live from MIT in Cambridge, Massachusetts.