And welcome back, everybody. My name is Jeff Kelly. I'm here on theCUBE, live at IBM IOD in Las Vegas at Mandalay Bay. As you can see, I'm your host for the day. Dave and John have stepped aside and actually handed over the mic to me, which is a rare thing these days. But I'm happy to be here, and we've got another great segment coming up. Right now, as a matter of fact: Tim Vincent, CTO of the Information Management Group. Welcome. Thank you. So, Tim, we were just talking a little bit before the segment. You're working on a lot of big data projects, specifically around the PureData Systems. You're located up in Toronto at the Toronto Lab, which I know has been doing a lot of really interesting work for a long time. There's a lot of interesting software coming out of that lab. So why don't you tell us a little bit about your role, and then we can go into what you've been seeing around big data, specifically around PureData Systems, et cetera. Okay, so I'm the CTO for Information Management, as you said. And in that role, I have a purview over the technology, the architectures, and the strategy for those architectures in information management. And that runs anywhere from your traditional transactional database all the way to the new big data paradigms. And I've been spending a lot of my time this year on the PureData Systems. So, for our viewers who might not be really familiar with the PureData System, which is a pretty new offering, it kind of builds on the Netezza technology. But what's its essence? At its heart, what is the goal? What is it trying to do? Well, let me start: it's much more than Netezza. We actually have three offerings. There's the PureData System for Transactions, the PureData System for Operational Analytics, and the PureData System for Analytics. So I'll go through each of these, but let me first describe what we're trying to achieve.
So, if you look in the industry, all the IT areas are getting cut from a cost perspective. And at the same time, they're being asked to deliver a lot more capability while they're having these cost squeezes. So there's a lot of time that goes into just engineering systems, making them stand up and run. Forrester Research did a study where they looked at IT and found that 34% of IT projects actually deploy late. And as you look at... What was the number? 34%. 34%. I would think it'd be a little higher than that even, but it may be. We'll go with 34%. And when they looked at this, there was a lot of cost in just the initial setup, the configuration, and the ongoing management of these systems. When they looked at the data and started breaking it down, 45% of the cost often was just in setting up the hardware, integrating it, configuring it, tuning it. So there's a lot of cost in this, and while you're doing those things, you're not really delivering any value to the business. It's really just lost cost. And they also found a similar number, about 35% of the cost, went to similar things in the software: configuring it, setting it up, tuning it, et cetera. So we wanted to take those costs out of the equation. And at the same time, you want to have a system that's going to give you blazing fast performance at a low cost point. Performance is a funny thing, because anyone can get good performance by throwing lots and lots of hardware at a problem. But the trick is to get the performance at the low cost point. So we wanted to deliver that as well. And when it came to the transaction system, we were also really going after the private cloud type of space. In that paradigm, what we're seeing is you have your traditional systems-of-record workloads. And there's an emerging space that people are calling systems of engagement.
And these are really interaction systems that deal directly with the consumer. Mobile applications are a great example of these. And people in that model want to get into this paradigm where they have self-provisioning, so the application programmers can get things quickly. They can provision and deploy. They can manage and modify. They don't want to have to go to a central team to do this. They don't want to open a PMR and wait weeks to get the system. They want it now. So we wanted to build in those characteristics. So to sum it up, these are systems that are integrated for performance, integrated for cost and ongoing management, and that also give you this kind of private cloud, DevOps type of paradigm as well. Interesting. So what is your take now? It's kind of the appliance model. But what is your take on bringing together the transactional and the analytic in one platform? How do you make sure that these aren't silos of data? And how do you make sure that they're talking to each other and that there's a feedback loop between the transactional systems and the analytic systems? So if you look at it, this is the way the industry has been handling that type of problem for a while. The transaction systems are generally the source of the data that you're running your analytics on. So you've got these systems and they're running along, and you're creating new data about customers, about products, about sales, et cetera. And then what you do is you take the data out of those systems and you bring it into your analytic systems through ETL processes, through streaming processes. And what the industry is looking to do more and more is get much lower latencies on that data. They want to be able to do analytics on the data as close to the point of origin as possible. So they're moving this data through ETL layers, through streaming.
So when we talk about big data and the three Vs in big data, one of them is the velocity of data, right? So we have the InfoSphere Streams product. Now we have one customer, a large telco, that's using it to take data directly off telephone switches. So call data records come off the telephone switches and stream through the InfoSphere Streams product. They do a level of analytics there using some accelerator technology we have that does things like CDR deduplication and CDR mediation. And by doing these operations in this flow, you are actually offloading things from the analytic system, so you can do things like, say, online billing for telcos. So they're really tying all these systems together, and it becomes an integration exercise and a governance exercise as well, because you want to really understand where things have come from, what the relevance of the data is, et cetera. You know, it's interesting, because certainly the old paradigm that you talked about, where you move the data from your transaction system, maybe daily in a big batch load, and maybe the next day you run some reports: our research thesis here at Wikibon and SiliconANGLE is that that's really not going to cut it with the pace of business today. And in the big data space we're seeing a lot of so-called connectors between databases, kind of taking your Hadoop installation and maybe throwing on an analytic database to do some of the more real-time things. Our belief is that that really is not a long-term fix; that's a short-term situation. We're seeing some startups come to the fore, companies like Hadapt, building a Hadoop-based system with SQL, real-time, and big data. What's your take on that?
Are we moving towards the point where we're going to deliver on a unified big data platform where everything lives in the same system, and the days of connectors and moving that data are a thing of the past? Or where are we going then? I think longer-term you may see that happen, but I think in the short to medium term, you're still going to see a level of data movement. And what we're seeing in the enterprises when you talk about Hadoop is that they're using that platform, a BigInsights type of platform, to bring all the data together initially. And why are they trying to bring it together? Because they've never really been able to do so before without modeling it, et cetera. So they bring it together, they have all these attributes, and then they're going through a level of exploration, right? They're trying to determine, okay, what do we have here? What kind of questions do we have? And as they're doing that, they're doing a level of transformation on the data. And then they may reach a point where they say, okay, this is really, really useful insight. We'd like to now make that much more a part of our operational process. So then you've got this complete linkage of, okay, I started here, I've gone through these transformational steps, here's the job I've created. I want to now generate a direct path from here to my operational systems, which could be your operational analytics system or your analytics systems. I think that's the first step you're going to see. And then you'll start seeing this concept of not moving data, which is something very, very important to customers. And I think what you're going to start seeing is that people are also going to want to do things as much as possible at the source of the data. But it doesn't mean they're not going to still have to have aggregation engines, et cetera, right? So it's not that that type of workload is going away.
It's simply being, in your opinion, kind of augmented with these other types of workloads we can now do that we couldn't ever do in the past. Yeah, one of the things that you see in a lot of enterprises is that it's hard to get rid of legacy. You've got a lot of investment in this, and people really want to evolve out from there. So, the IBM mainframe, you know, it's been around, and... Just look at the number of COBOL applications still in the world, right? Yeah, that's interesting too. I mean, we're here in the big data pavilion, and there are a few interesting startups around here, Datameer and some others in the Hadoop space. How does IBM view that growing ecosystem? And how do you interact with it? Because I'm sure you're getting inquiries from customers who hear all the noise around big data. What's your strategy, from a technology perspective, for partnering up and working with some of the smaller startups that are doing some of the interesting things out there? I think it's an important aspect. We live in a highly heterogeneous world, right? And very few customers have a top-to-bottom stack that's just one technology. So it's absolutely critical to link these things together. When you go into a customer base, they may have a Cloudera system, they may have different ETL engines, et cetera. So you've got to be able to plug all these things together. And there's also a community around each of these that's incredibly important, right? If you don't interface, if you don't integrate, if you don't embrace those communities, they will ignore you. So I actually think this is a really important aspect. And we absolutely are trying to embrace all of those. Right, so can you dig into that a little bit? What are some of the things you're doing with the big data community? What are some of the interesting things you're seeing happening?
And maybe some of the things you're working on, specifically. So let me give you another example where we actually stretch the definition of big data a bit. Okay? So let's talk about NoSQL databases. Okay? All right. And then I'm going to link that back to the systems of engagement model again. So what's happening there is, again, you've got developers trying to build applications very quickly. They're looking at different forms of data. And one of the things we see in this community is a lot of people building out around JavaScript: it's in mobile devices, it's in servers, it's everywhere. And they're using different technologies to write applications. They're using things like JSON as a data type. And one of the reasons they're using this type of data is because it's schema-less. You're effectively controlling the schema in your application logic. And why is that important? Well, it gives you this capability of delivering rapidly and changing rapidly. You don't have to go back and change your fixed schema. That whole idea of the enterprise data warehouse taking maybe 18 months to model the data and get it set up the way you want it, and the next thing you know, the questions you want to answer are out of date. Exactly. So what we're trying to do is take some of those concepts, like JSON, and add them into some of our core systems, and then embrace the community around that. So they have the capability of starting out in that paradigm, but when they want to get into more of an enterprise world, they have the capability of bringing it in and picking up all the governance aspects, all the operational management aspects that people care about, compliance laws, et cetera.
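As an illustration of the point above about schema-less JSON, here is a minimal Python sketch in which the application logic, rather than a fixed database schema, decides how each document is read. The customer documents, the field names, and the `describe_customer` helper are all hypothetical and are not drawn from any IBM product.

```python
import json

# Two hypothetical customer documents. The second has a field the first
# lacks, and omits one the first has; no fixed schema is declared anywhere.
raw_documents = [
    '{"id": 1, "name": "Ada", "email": "ada@example.com"}',
    '{"id": 2, "name": "Grace", "devices": ["phone", "tablet"]}',
]


def describe_customer(doc: dict) -> str:
    """Application logic chooses the fields it needs and supplies defaults."""
    name = doc["name"]
    email = doc.get("email", "no email on file")
    device_count = len(doc.get("devices", []))
    return f"{name}: {email}, {device_count} device(s)"


customers = [json.loads(raw) for raw in raw_documents]
summaries = [describe_customer(customer) for customer in customers]

for summary in summaries:
    print(summary)
```

Adding a new field to later documents requires no migration here; only code that wants the field has to change, which is the rapid-change property described in the conversation.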
Right, and some of the hackers out there aren't necessarily concerned about that when they're brainstorming and putting together their apps, but certainly when you bring it into an enterprise environment, that's very important. But you touch on an important part of the ecosystem that we're not seeing a ton of activity around, and that is the big data application space. We're starting to see things happen. I mentioned Datameer earlier. They've actually announced an online application marketplace today, but really we're just starting to see that marketplace take shape. So my position is that we need to see more agile development tools come to the fore that allow you to bring together data from different sources quickly because, as you said, developers want to create their applications and get them up and running fast. Just like you don't want to do your analysis a day late, you don't want to wait three days to get your application up. You want it up today. So what are you seeing in that space in terms of some of the applications being developed that are interesting to you, or the application platforms that you might see out there? You made a point there that I think is important, and I want to get into that before answering the other question. You talked about tools. And I think in this space, that's going to be one of the areas that's incredibly important. You know, sometimes the industry focuses too much on the runtime engines and how you're processing the data. And really, the tooling on top of that, which allows you to build out the applications, manage the applications, manage the data, and understand the data, is going to be more important. So we've got to get that ecosystem of tooling around it. Yeah, absolutely. And so let me use that to go into your next question. So what could be an example of the type of things that you have to do?
Well, one of the examples is that as you're trying to bring all the data together, one of the first things you have to do is understand what kind of data you have today. Right? You know, that's a big problem for a lot of people. Sure, that was hard in the old days with traditional data, and now it's tons harder, yeah. So what kind of tooling could you build, and what kind of applications could you build, that help you understand that data? That help you understand the assets you have on that data? And then once you've got that information, it can help you actually build things and have more of a compositional model, right? You don't always have to build from new; you've got a lot of assets. How do you build out on those? I think that's going to be an interesting area. Yeah, I couldn't agree with you more. I think most of the talk around Hadoop has been around the infrastructure, the plumbing, and of course that's very important, because without that, these applications wouldn't be possible. But really, when you start to talk about value, especially to the business side, they're interested in, you know, how is this going to impact my business? How are we going to solve business problems? That's about applications that actually do solve everyday business problems, or whatever the case may be. So I couldn't agree with you more. I think it's an important time in this space. You know, there's a large analyst firm that won't be named that talks about the trough of disillusionment. And in order to get through that, I think we've got to start to see that take shape a little bit more. That's where we're going to see some disappointment if we can't bring this promise of big data through to the application layer and deliver it to people in interesting ways. It's all about delivering value in the end. And if you're not delivering the value, all the infrastructure in the world is meaningless.
Yeah, we had Pauline Nist from Intel on earlier. She said something very interesting and put it in an interesting way. She said big data is a business solution, not a technology solution. I mean, is that how you view it? Yeah, absolutely. So, you know, you're a technical guy, but when you do talk to the business, the business side is not interested in Hadoop or whatever you want to use. They're not interested. They want to know how you can solve their business problem. So how do you go about talking to the business side about all the interesting things that are happening in the big data space and translating that into a language that they understand? Initially, you don't. Seriously. You know, as I've grown as a technologist, I always wanted to start with technology, and you would have a discussion with a customer and they'd say, I want you to do this. And it was really important to say, I don't want to do that for you; I want you to really tell me your problem. So, more and more, I focus on saying, explain to me what you're trying to do, what your problems are. Once you understand those problems, then you can actually start helping them understand how you can solve them. So I think it's really important to start the discussion from, what are you trying to do? That's the way I would have that discussion. Well, we're just about out of time. So, last question: we love to hear predictions. So we'll put you on the spot a little bit. If we're here talking at this table in 2013, what are some of the things we'll be talking about in terms of the work you'll be doing over the next year? Where are you really putting your focus, and what are some of the more interesting things we can look for? Well, we've touched on some of them already.
I think it's that ecosystem, the tooling around this big data concept, that really starts allowing you to unleash the value of it. Allowing you to understand it, build applications, build an ecosystem around those applications, and make sure that when you build an analytic application, you can run it in more than one place, right? Let me go back to that CDR example. Some customers may want to run that in a streaming system and do those operations there, but there may be other operations, other PMML type of things, that you want to run anywhere. So how do you actually build these applications in such a way that they can be deployed against multiple data sources? I think that's going to be one of the trends. Really interesting, all right, great. Well, Tim, thanks so much for coming on. Really appreciate it, first time on theCUBE. Thank you for the time. We hope you'll come back. Tim Vincent, IBM Fellow, CTO of the Information Management Group here at IBM. We're at IBM Information On Demand. We're wrapping up day one of coverage here, live on theCUBE, SiliconANGLE's premier live TV streaming platform. Thanks so much for watching. We will be back tomorrow morning with another full day of coverage. We hope to see you there.
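The CDR deduplication example from the interview, and the closing point about building an analytic operation once and running it against more than one kind of data source, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`caller`, `callee`, `start_time`, `seconds`), the choice of deduplication key, and the `deduplicate_cdrs` function are hypothetical, and none of it reflects how InfoSphere Streams or its accelerators work internally.

```python
from typing import Iterable, Iterator


def deduplicate_cdrs(records: Iterable[dict]) -> Iterator[dict]:
    """Yield each call record once, in arrival order.

    Works on any iterable, so the same logic can consume a stored batch
    (a list) or a live feed (a generator). The key fields are assumed.
    """
    seen = set()
    for record in records:
        key = (record["caller"], record["callee"], record["start_time"])
        if key in seen:
            continue  # duplicate emitted by the switch; drop it
        seen.add(key)
        yield record


# Hypothetical sample data: the first two records are duplicates.
sample = [
    {"caller": "555-0100", "callee": "555-0199", "start_time": "09:00:00", "seconds": 42},
    {"caller": "555-0100", "callee": "555-0199", "start_time": "09:00:00", "seconds": 42},
    {"caller": "555-0101", "callee": "555-0199", "start_time": "09:05:00", "seconds": 10},
]

# Batch use: the whole list is already in memory.
batch_result = list(deduplicate_cdrs(sample))

# Streaming use: records arrive one at a time from a generator.
stream_result = list(deduplicate_cdrs(record for record in sample))

print(len(batch_result), "unique records")
```

In this sketch the `seen` set grows without bound, so a long-running stream would need a time window or some other eviction policy.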