Hi everybody, we're back. This is Dave Vellante at Wikibon.org, and I'm here with Stu. Stu Miniman is my co-host today. We've been going wall-to-wall with coverage. This is day two, we're at IBM Edge. Inhi Cho Suh is here. Cube alum, good friend of the Cube. Welcome back, great to see you again. Thank you, Dave. I'm excited to be here. Second year of Edge; we were down at Orlando last year. It was less than half of the folks here, so we're excited to see the growth. Oh yeah, we had to get a bigger facility. 4,700-plus folks in attendance this year. We had 2,000 in Orlando. I liked Orlando, but it's been interesting, the broader mix of audience here at this event. Yeah, so talk about that a little bit. It's not just a storage audience, right? Oh, far from it. You know, the day one session topics were really about data, cloud, and results. Results around the business, results around the economics, and we talked about data economics, and one of the topics was, you know, there's a tipping point. You've heard of the tipping point, right? Sure. At some point, what happens? Well, what's interesting is, historically, as data volumes increased, what would happen? Performance of systems would... Degrade. What we're saying is actually, the notion is, as you add more data, you actually get more context, and as you add more data, if you leverage capabilities like flash, the economics change, your performance fundamentally increases, and the cost actually decreases. So that's the new tipping point. Yeah, and we heard a lot of talk, it was all the chatter on Twitter about, you know, how dollar per gigabyte should really be dollar per IO, and I sort of chimed in and said, you know, really, the metric should be about business value. Yeah, so I don't think it's IO. I think it's dollar per workload, right? It's really the workload. Are you optimizing for the workload?
So one of the questions we actually got from a client is, you know, if you're going to embark on, for example, a big data project around Hadoop, are you going to use, you know, scaled-out servers with distributed memory, or are you going to have direct-attached storage? And the answer is, it actually depends on the workload and what you actually want to do with it. You know, are you going to use it for in-line production type activities, or are you just doing a sandbox activity to kick the tires? If you don't care about the nature of the data, or you don't necessarily care about security or backup or availability, then that changes. So it's about the economics. We were talking off camera. We had Ambuj Goyal on yesterday, and you know him well. He's a brilliant guy. Yeah, he really is. And what I love about him is he's embracing this disruption. On the one hand, you got the tube. On the other hand, you got the Amazon, you know, AWS cloud factor, and it's aimed right at his business that he just took over six months ago, and he just smiles and loves it. You know, in his keynote, he talked about how he wants to take away the word storage and replace it with the word data, because that's actually where the value is for most clients, and it's what you want to do with it, managing data and moving up the scale in terms of the value. So you're at the heart of that disruption, Inhi, which is a good place to be, and it's like these traditional lines, even organizational lines, are starting to blur. They're changing within the customer base. What are you seeing there? Oh, data is a hot space, right? Everyone's talking about big data. What's interesting is, I actually gave a presentation earlier today around big data myths. You know, what are the myths? Well, it's only about volume. It's not about volume. We've talked about it. There are other V's, there's a lot more complexity. We've talked about, oh, big data doesn't include transactional data?
That's not true. Most, actually, the majority of big data implementations include integrating transactional data. Why? Because that's actually how most of the world operates in terms of business operations, and you want to integrate that with new data sources, new data types, to actually make measurable impact at the point of contact. It's interesting, the conversations that are coming up and all that we're covering. Yeah, let's talk about that a little bit. I mean, the notion of bringing together analytic and transactional data. We talked a little bit about it in the past. What are you seeing some of your customers do in that regard? Oh, sure. Probably a few use cases. I would say a couple use cases. One is data warehouse augmentation, which is probably one of the dominant use cases, where we've seen clients say, you know what, I want to do one of two things. I either want to have a landing area to land a lot of data, and use technologies like Hadoop as a pre-processing kind of landing area before I target it to an operational data warehouse or data store or database. Or you have it in conjunction with the warehouse or database to do active archiving. Why active? Well, given the nature of what we do and how we live, you don't actually know sometimes how to tier the data appropriately, and you don't know when it's going to be needed or when the context or the value of the data is going to be relevant to the particular queries that you're going to run at a given time. So we're seeing a lot more interaction around transactional data, around data warehouse augmentation. In addition, we're seeing it especially in real-time fraud, right? Why? Because what you're trying to do is step in, in the middle, and, depending on the query you're doing, interrupt a transaction that may be happening.
You swipe the card, and before you fully process that transaction to hit someone's bill, you want to ensure that that really was the right person and the right geographic location, and that the transaction wasn't done 3,000 miles away in another location. It used to take six months to figure that out. Yeah, now, I get a call from... Oh, I don't even get a letter. I actually get a text. Six months ago, six years ago, you'd get a letter, remember? You'd get a letter. Oh, you might have been hacked. So what happens now? Oh, now I get a text. I get an email. I get a call. I especially get calls when I'm traveling internationally. I love getting it via text, because then I'm not interrupted as I'm doing it, and I can text back and say, yep, it's a valid transaction. I made that purchase. If I'm traveling internationally, when I get a call, it's quite nice. Are you sure you want to make that extra shoe purchase? Yeah. I'm kidding. Sort of. How about Streams? What you guys are doing in Streams is really interesting. It's sort of a new model, right? Yeah. You're actually helping clients make decisions on data before you even persist that data. Yep. And that's a trend that... Let's see. I think it came out of IBM Research, that technology. Oh, yes. You might have done an acquisition there as well. I can't really remember. No, no, no. It's pure organic investment, it's one of... This is pretty cool stuff. So explain to us what this is all about. How does it all work, what is it, and how are you applying it? The genesis of Streams actually started with an investment around security intelligence, in terms of the use case, and really protecting citizens. One of the things we realized, and this was true for a lot of different organizations, was that they actually have a lot of data in-house. And if they had the ability to make sense of it in real time, they could actually prevent disasters from happening. So how do you do that?
How do you correlate data sets that are incredibly varied, generated by not only video surveillance, text, and email, but taking into consideration GPS locations, taking into consideration the identity of individuals, to resolve all of that, and then do that as it's coming at you? So if I use a simple analogy like water, water in a river, water in a lake, water in a waterfall, Streams is the only technology in the industry that can actually analyze water moving as fast as a waterfall, pick out the one piece that actually matters, and take out the noise of the waterfall. Whereas historically, if you think about Hadoop and other types of operational systems, it's either a river flow or it's a lake, right? The water's fairly static, meaning the data's static. The data may change, but the speed at which it's falling down is not the pace of a waterfall. So Streams is really bad. Yeah, I wish our CTO David Floyer was here, because actually he's looked at some of the big streams out there. And really, we feel that this is the intersection of kind of the big data wave and the transformation of traditional infrastructure. So it kind of brings together, you know, what IT was doing and where it needs to go. So it's pretty fascinating technology. We analyze data that's moving, information that's in motion. That's actually the value. So why do you need to move the data, right? One of the fundamental benefits is to leave the data where it is and learn to analyze the data in place, or analyze the data while it's in movement, even before it lands. So I'm wondering... This is how we process as human beings. What's your thought on kind of the industrial internet, internet of things, or internet of everything meme that's going on in the industry right now? You know what? Streams is actually one of the core technologies that clients are using to leverage what I would consider the machine-generated data. What you're talking about is infrastructure, and there's two dimensions to it.
One is process efficiency, operational process, workflow process, asset optimization. Another aspect is that most companies are generating a lot of insights, a lot of data, but they don't actually know how to monetize it. And part of monetizing it means you need to understand when to use those insights in context, at the point of impact. For example, in life insurance. Or any insurance industry. You know, life-changing moments are when you actually rethink your benefits, healthcare benefits, family benefits. I'm getting married, I'm about to have a baby, I just wrecked my car, I'm switching jobs. In that moment, you're probably more apt to consider your insurance policies than you are at any other time, because it's a life-changing moment. Life-changing moments happen in point situations. And you need to be listening and active to understand it. Or real-time CDR mediation. Or understanding that you're standing next to a mobile billboard, and people know that you're going to be more apt to certain types of brands and products. And the mobile billboard automatically shifts based on your physical presence next to it. Okay, so we talked about a couple of commercial use cases. Let's talk a little bit more about the application of this technology in things like security, facial recognition. Is that something that's actually happening today, or is that kind of a future? Security is a huge aspect. Security intelligence, fraud detection, really learning to kind of minimize the risk. In terms of what I would consider political and geographic sort of security, in terms of political boundaries, geographic boundaries, there's a company called TerraEchos that's actually leveraging our streaming capabilities. They're embedding fiber optic cables underground, and we're sensing that in real time. So typically, to do perimeter surveillance, what would you do?
You'd have either video cameras, or you'd have people, you know, manning the outside perimeter, or you'd have helicopters come up and down and observe. Now, because of the sensing ability, we can actually see when a squirrel passes, when a human passes, when an object passes, and based on heat, radiation, based on weight, size, airflow, who's there and who's not there. And it's a completely different dimension, the ability to analyze something that's going on in one part of the world at the same time something else is happening, because the physics of objects is that we physically cannot be in two places at the same time, right? Not yet anyway. Not yet. Did they do that on Star Trek? I don't know. I don't think so. Even Star Trek couldn't do that. Okay, so one of the talks you're giving here was on big data, how to get started. So a lot of clients say, you know, big data, big data is big money, but is that a myth? Is that a reality? Is it a future promise? What are customers asking you in terms of the how to get started question? Is it, how do I actually get value out of my data? Is it, how do I actually deploy the technology? What are the big questions that you're getting? Probably the two questions I'm getting: one is around where should I start, meaning what use cases are kind of demonstrating real ROI and results. That gets to your point about value. And we've determined that there's five dominant use cases that we've seen clients actually be quite successful in starting from. And then the second piece is, do I have the right skills in house to get this stuff up and running? And the skills could be quite varied. Skills could be around, do I have the analytic skills, applied math skills, to write the queries, data mining skills. It could be visualization techniques, because once you get to larger pools of data, consuming it can take quite a bit of time.
Or do I have data integration and quality and governance and privacy skills to ensure that I'm actually putting the right governance around how that's being accessed and touched? So what else is exciting you? Let's see, DB2. It's pretty exciting, right? It got a nice facelift in April. Oh my gosh. So talk about that a little bit. You got ours? I love it, love it. We've just announced a capability and are delivering it: DB2 with BLU Acceleration. It is our new in-column memory, sorry, in-memory column store processing. I'm so excited. Can't get the words out. Columnar processing capabilities in our database. So, unique aspects of it. One is what we call adaptive compression. So we can actually compress the data down by an order of 10x. 10x, so like 10 terabytes looks like one terabyte. In addition to that, we have the ability to keep most of that data in the compressed state and run the analytics on the compressed state. Most analytic tools require you to uncompress the data and recompress the data, so you really don't get the storage savings or the time efficiency when you're running the queries. The second piece is what we call leveraging the capabilities of the hardware around vector processing. So you can drive a single instruction for a query and then run it across the multiple threads of a system. That's also enabling you to quickly get the results back. The other aspect we've also applied is this method of data skipping. So it can actually skip large orders, large sets of data, and maintain the order of the query so that you get the results back faster. Now what does this really mean in terms of real end user benefit? We're talking on the order of 100 to 1,000 times faster for individual queries. We've seen 8 to 25 times faster for entire workloads. We actually did a demo here. I was excited to do a main tent demo.
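The data skipping idea described above can be illustrated with a minimal sketch. This is an assumption-laden toy, not how DB2 with BLU Acceleration is implemented: it assumes each block of a column keeps a small min/max summary, and a range query consults that summary to avoid scanning blocks that cannot contain matches. The names `Block`, `build_blocks`, and `count_in_range` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """A chunk of one column plus a min/max summary (hypothetical layout)."""
    values: List[int]
    lo: int
    hi: int


def build_blocks(column: List[int], block_size: int) -> List[Block]:
    """Split a column into fixed-size blocks and record each block's min and max."""
    blocks = []
    for start in range(0, len(column), block_size):
        chunk = column[start:start + block_size]
        blocks.append(Block(values=chunk, lo=min(chunk), hi=max(chunk)))
    return blocks


def count_in_range(blocks: List[Block], low: int, high: int) -> Tuple[int, int]:
    """Count values in [low, high]; return (matches, blocks_scanned)."""
    matches = 0
    scanned = 0
    for block in blocks:
        # Skip the block when its value range cannot overlap the predicate.
        if block.hi < low or block.lo > high:
            continue
        scanned += 1
        matches += sum(1 for v in block.values if low <= v <= high)
    return matches, scanned


# Illustrative data: an ordered column, so most blocks can be skipped.
column = list(range(1000))
blocks = build_blocks(column, block_size=100)
matches, scanned = count_in_range(blocks, 250, 260)
print(f"{matches} matches, scanned {scanned} of {len(blocks)} blocks")
```

In this toy the benefit depends on how well the data clusters: with ordered values only the one block covering the range is scanned, while randomly scattered values would leave few blocks skippable.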
We showed BLU with Cognos with a FlashSystem 820, running on two nodes of a Power 780, and for the load time, we put in over a terabyte of data, I think it was like 3 billion rows, 20 million dimensions, all in less than 12 minutes, and as you're running the queries in the product dimensions, it's moving as fast as you can think. It's moving as fast as you're moving the clicker. There was no hourglass, tick-tock, tick-tock, wait for your report. So you put the entire database in flash in that demo? In that demo we actually had a split. We used some flash and we also had some mix of SSDs, but it was mostly flash. Okay, but most of the database is running on semiconductor. And then we also did a demo using pureScale, DB2 pureScale, because you're really going to see the flash benefits when it's an IO-intensive workload, right? First of all, flash will make anything faster, but the order of magnitude on an IO-intensive workload, especially around transactional systems, is going to be just ridiculous. So speaking of databases, IBM made an announcement today with Mongo in the NoSQL space. Can you give us a little color on that? Yeah, you know, this is about IBM's commitment just in general around the open source community and also with OpenStack. And we're adapting and understanding that clients are going to need new capabilities in order to prepare for new types of application development. We've actually added JSON NoSQL support within both DB2 and Informix as well. In addition to that, we also wanted to partner with Mongo because of what they've done in terms of the community and advancements around cloud-based application sets and some mobile-based application development. Yeah, okay, so Mongo's obviously been focused on that JSON integration for a while, kind of a leader there. So it's a big moment for them. They must be excited. We're actually doing, is it next week still? Next Friday in New York City.
We're doing the MongoDB conference, which will be great, with the guys from 10gen, so Jeff Kelly and I will be down there next Friday. Next one. I'll be driving the Cube down from Marlborough and it'll be great. Good action down there. You're everywhere. East Coast, West Coast. It's crazy, isn't it? I don't know if people from IBM will be there, but we'll definitely get them on if in fact they are. So that's cool. Okay, good. Inhi. Amazing. Did we miss anything? Let's see. We covered so much. What's next for you? Where are you off to next? Wherever you are. Yeah, okay. Well, hopefully we'll be at IOD. We'll see you before that. But we've got a long summer coming up ahead of us. Well, nice job. You know, it's always great to talk to you, and we love the enthusiasm. How many keynotes have you done this week? Four. Four? Good. I've got one left. One main tent, three breakouts, and one left. Good. Great. Well, good luck with that. Inhi Cho Suh, thank you so much for coming on the Cube, sharing your enthusiasm. I don't know how you keep all this great information in your head, but our listeners appreciate it. So great to see you as always. Well, thank you very much. All right, keep it right there, everybody. We'll be right back with our next guest, Charles Long, who's the CEO of Centerline. This is a really good story. He was one of the keynote speakers in the morning. And we'll have him on next. So keep it right there. This is the Cube. We'll be right back.