Hi everybody, we're back. This is Dave Vellante with Stu Miniman. Jeff Jonas is here. Jeff is a long-time CUBE friend, and we really appreciate you coming on, Jeff. I know you've been on a whirlwind tour, just did an Ironman, so really appreciate you hanging in there for theCUBE. Tell us about your keynote, what's going on, what was the reaction from the audience? We're here at Edge, second year in a row.

So the keynote today was exciting for me. It was the first time it was broadly, publicly known that my G2 invention had a role in helping modernize voter registration around the last election. Hundreds of thousands of people ended up registered and voted who otherwise would not have.

Congratulations.

Yeah, it's really cool. It's one of the things I like about my job: every now and then, when you're creating systems like this, you just turn around and look at the effects, and the effects are amazing. Some of the goals of the project, and we worked with The Pew Charitable Trusts on this, they led all this election research focused on the election rolls, but the goal was to increase the quality of the election rolls and to let states have a better understanding of who's moved, who's eligible to vote, and who may have voted. We're a very mobile country, and many people just don't know that when they move, their registration doesn't follow them. So it means our rolls are incomplete. At the end of the process, it means our election rolls are more credible, it provides more confidence in our election process, and more people have access. So it was real exciting. It was fun, I got good feedback. I highlighted my G2 invention, and it's just the tip of the iceberg of what this G2 is going to do.

So talk more about G2.

So G2 is a technology that I dreamt up. I looked at my body of, well, okay, let me step back.
An IBM executive says to me, hey, if you had a big idea, we'd fund it. And I'm thinking to myself, I've built a hundred things in my life. Knowing what I know now, if I could only build one more thing in my whole life, what would it be? And I went, oh, it'd be this. Yeah, this would be cool, it'd be useful, because I'm trying to be useful these days, you know? So I went and showed my boss and go, hey, what if I built IBM this? And he's like, wow, you'd build that for IBM? And I'm like, yeah. Funny story what happens next, but basically I ended up with people and we started building it. I spent the first year on paper, then the next year and a half building it, basically still secretly, like a skunkworks project. For the first two and a half years the world didn't even know about it.

It's designed to take very diverse data sets and integrate them. It's kind of like, how do puzzle pieces find each other in the puzzle? When you go from puzzle pieces to puzzles, you get to whole pictures, and when you get whole pictures, you make higher-quality predictions. Man, the range of organizations that are going to benefit from this is really across the spectrum. It's everywhere. My own personal goal is that G2 will one day be seen as maybe the first real context computing engine, like, for real. I mean, there'll be bigger and better things someone will build, but I hope it'll be seen as really the first of its kind. It's general purpose: there's the work we're doing with The Pew Charitable Trusts; I'm also doing work in the maritime domain to protect shipping lanes; I'm also doing work in anti-money laundering with financial institutions. And you could use one G2 to do them all, all at the same time, in the same schema, with the same algorithms, because it's actually the same problem.

So the alpha geeks say that that problem is really hard to solve, you know. It's different data sets and making them make sense.
And actually getting the quality to where you need it to be, so that you can trust what comes out of it. So how did you do that?

Yeah, well, the first principle is you want the data to find the data. Let me tell you exactly what this means. For the moment, let's pretend you're an organization, just you right here. Every new piece of data that arrives at you, you just learned something. For any organization, the moment an employee changes their emergency contact phone number, that organization just learned something. If somebody enrolls in the loyalty club program, they just learned something. So it turns out, every time a piece of data lands, that is the question. It is the question, if you want to be able to sense and respond. So the very first thing that G2 does is this little game called data finds data. It takes the features on a new piece of data that's just arrived and asks, how does this relate to the previous observations I've seen? And that's like taking a puzzle piece to the puzzle to see where it fits.

And I've been doing these puzzle projects. I've done two with kids, I've done two with adults. In many settings I've shown the first puzzle project, where the kids were putting a puzzle together. But I haven't shown the fourth one, which was done with four drunk adults. The main thing I learned from that one is that drunk people are sometimes unreasonably optimistic, because one of the people in the project would take a puzzle piece, go, "I think it fits," and use their fist to pound it into place. That was a big lesson. But anyway, it turns out there's a very general way you can figure out how one piece of data relates to others, and I've generalized that problem. The better a job you do figuring out how data relates, the easier it is to figure out what is relevant to whom.
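The "data finds data" idea described above can be sketched in a few lines. This is a toy illustration, not G2's actual design: every arriving record is treated as a query against everything seen before, via an inverted index of its features. The record IDs and field names here are hypothetical.

```python
from collections import defaultdict

class ContextStore:
    """Toy sketch of 'data finds data': each new observation is first
    used as a question against all prior observations, then remembered
    so future arrivals can find it."""

    def __init__(self):
        self.feature_index = defaultdict(set)  # (field, value) -> record ids
        self.records = {}                      # record id -> record

    def ingest(self, rec_id, record):
        # 1. The new record IS the question: find prior records sharing features.
        features = {(k, v) for k, v in record.items()}
        candidates = set()
        for f in features:
            candidates |= self.feature_index[f]
        # 2. Remember the new record so future arrivals can find it.
        for f in features:
            self.feature_index[f].add(rec_id)
        self.records[rec_id] = record
        return sorted(candidates)

store = ContextStore()
store.ingest("r1", {"name": "J. Jonas", "phone": "702-555-0100"})
store.ingest("r2", {"name": "Ann Lee", "email": "ann@example.com"})
# A new observation sharing the phone number "finds" r1 on arrival:
matches = store.ingest("r3", {"name": "Jeff Jonas", "phone": "702-555-0100"})
print(matches)  # -> ['r1']
```

The key design point is that matching happens at ingest time, not at query time: sensing and responding is a side effect of every new puzzle piece landing.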
Now, general-purpose context accumulation is: give me the features from your observation space and weave it all together. Then it becomes very domain-specific how you benefit from it. See, what a lot of organizations are doing is, oh, we want to do social media and sentiment; well, let's build an algorithm just to study that. Then somebody else goes, oh, we want an algorithm to study fraud; well, let's build an algorithm just for that. What if you could actually have an algorithm that allows you to commingle that data and benefit horizontally across it? Talk about variety and orthogonal data. It turns out the quality of predictions you get when you mix diverse data together goes up. And by the way, that's how you find errors in data. Better yet, it's how you find lies.

Can I hear more about that? Yeah, talk more about that.

The question is, how do you find a lie? So if I told you I was 30, well, you'd look at me, and that'd be your second data point. If I go, I'm 37, you'd be like, you're lying, just look at you. Okay, fine. But if I said I was 48?

I wouldn't believe you.

And if I said, yeah, I'm from Vegas, it's hot here, I'm just dehydrated. Or, you know, what do 30 Ironmans do to you? Okay, so look, if I told you I was 48 and a half, that's a lie; I'm going to be 49 next weekend. But how would you know that's a lie? The only way you know it's a lie is if you introduce a secondary data point. You have to have something to contrast it with. This is a really interesting thing about these sensemaking algorithms, and I think the way G2 will probably present itself from IBM to the world is under the brand InfoSphere Sensemaking. It turns out that with errors in data, natural variability is your friend. You actually want it. You don't want to polish every puzzle piece to perfection.
I'm not talking about MDM, where you get a chance to onboard somebody and you'd better have the account balance right, you'd better have the address right, and that's your chance to get it right.

He's talking about inferring something.

Right. You've got data that you own and control, those gold records, but now you've also got data that you don't own and control, data that you're getting externally, and you're trying to bring it together and it might have lies in it. Well, this natural variability, spelling errors, transposition errors, is really your friend. My favorite example is when you search Google and it says, "Did you mean this?" It's not looking at a dictionary; it's remembering everybody's errors. If it didn't remember the errors, it wouldn't be so smart.

And I've got a little personal story on this. My youngest son, his name's Dane. He's born, and I get his date of birth wrong. I forgot his date of birth. I convinced mom, and now we teach him his date of birth and it's wrong, and everywhere we register his date of birth it's wrong, off by a couple of days from his real date of birth, until he's five. I order his birth certificate because I want to take him to Mexico, and I'm quite depressed because I can see that his birthday is wrong.

You've got to go to your kid. Bad daddy, this is a bad daddy story.

Okay, fine, this is a bad daddy story. I go to my kid and I'm like, hey, I got your birthday wrong, the birthday I taught you. He looks pretty defeated. But I had it all set up, I had a PR line on it, and I'm like, look, it turns out you're a little older than we thought.

For a kid, that's just great. Five days?

Two days, he was two days older than we thought. He just thought that was fabulous. But listen, imagine that. Any smart system would have seen his one date of birth. And then the very first time I introduce this new date of birth that has never been seen across any channel...
Well, of course that would look wrong. So what would you do? Any good system would snuff it out. Well, then I present it again, but guess what, the system doesn't remember that the evidence is building up, because you got rid of it. So it turns out in smart sensemaking systems you have to let dissent fester. And that's actually helpful in finding lies and deceit in data.

Can you tell us more about this money laundering work? Obviously this is something that's relatively new for you. Is it applying G2 to that problem?

Yeah, I carefully pick my battles these days, because, well, I'm a curious person. I stir up all kinds of stuff, right? But I've got to be careful what I stir up. So I advise in lots of areas, but now and then I actually pick a real horse to ride. I spent three or four years on this voter registration work, really deep-diving into it. Well, it turns out financial institutions have software they buy that detects money laundering, and it produces leads. The problem the financial institutions are facing is: here's a machine, it makes a lead. It goes, hey, I've got a lead, and gives you the lead. And you go, wow, I've got a lead. You chase it down, and maybe an hour later you're like, well, it turns out that's not a lead. But the machine finds another one: here's another lead. And it's doing this to hundreds of people, you know? It's giving all these hundreds of people leads. I go in and talk to these people and, guess what: how long have you been working here, three years? When's the last time you got to ring the bell? All these leads, aren't you excited? Ooh, there's another lead. And it's like, years I've been working these. It's hard to keep the morale up. So it's really like a false-positive engine. I don't know, maybe it's better to go random. I'm probably exaggerating; it's probably not better.
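The "let dissent fester" idea above, keeping conflicting values around instead of snuffing them out, can be sketched like this. This is a toy illustration under my own assumptions, not G2's actual mechanism; the attribute names and dates are hypothetical stand-ins for the birthday story.

```python
from collections import Counter, defaultdict

class EntityProfile:
    """Toy sketch of 'letting dissent fester': conflicting values for an
    attribute are retained with observation counts instead of being
    discarded, so a repeated 'outlier' can accumulate evidence."""

    def __init__(self):
        self.values = defaultdict(Counter)  # attribute -> value -> times seen

    def observe(self, attribute, value):
        self.values[attribute][value] += 1

    def conflicts(self, attribute):
        # Any attribute with more than one surviving value is a point of
        # dissent worth examining: an error, or possibly a lie.
        vals = self.values[attribute]
        return dict(vals) if len(vals) > 1 else {}

dane = EntityProfile()
for _ in range(5):                    # the wrong birthday, registered everywhere
    dane.observe("dob", "2003-09-17")
dane.observe("dob", "2003-09-15")     # the birth certificate finally arrives
print(dane.conflicts("dob"))          # both values survive, with their counts
```

Had the first divergent observation been thrown away, the second one would arrive with no memory of the first; by retaining both, the contrast between data points, exactly what the transcript says is needed to spot a lie, is preserved.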
It's like throwing darts at the Wall Street Journal. It's like, well, there's one. So I deep-dived into this. I've been meeting with analysts and I'm really putting enormous amounts of time into it. I've personally written, so far, a 180-page technical document, including pseudocode and schemas, for my technical team to do something that I think is really going to significantly move the needle. It's just a use of G2, and the spec really just describes how to prepare the data space in a way that feeds into G2 to get a really cool result. The end result is that the quality of the cases organizations get is going to be higher. When you open up a case, how you choose which transactions to look at, and the order of those transactions, will be way more interesting than what an analyst could have stumbled into. So the quality of casework is going to go up, and it'll even take less time per case. You get higher quality per case in less time. That's a really interesting phenomenon about context-accumulating systems: we're going to see lower false positives and lower false negatives at the same time. Today what people battle is, you move the needle over here, you get bit on this side; you move the needle over there, you get bit on that side.

It's whack-a-mole.

Yeah. So it's going to change. General prediction quality is going to go up. You're going to catch more of the false negatives, the things you're missing, and you're going to weed out more of the false positives, so you don't have to waste your energy on them.

Great conversation, Jeff. So you've set the high bar for yourself. Where's the high bar IBM is setting for some of those big, audacious problems out there?

Well, I'll tell you what. After IBM bought my company, I roamed the labs.
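The case-quality point above, ranking leads by accumulated context rather than working them in arrival order, can be illustrated with a minimal sketch. To be clear, this is my own hypothetical illustration, not the 180-page spec or IBM's actual AML product logic; the alert IDs, account names, and the `context_links` score are invented for the example.

```python
# Toy sketch: rank raw money-laundering alerts by how much corroborating
# context has accumulated around the entities involved, so analysts work
# the most credible leads first instead of chasing them in arrival order.

alerts = [  # hypothetical alerts: id, entity, count of corroborating context links
    {"id": "a1", "entity": "acct-17", "context_links": 0},
    {"id": "a2", "entity": "acct-42", "context_links": 6},
    {"id": "a3", "entity": "acct-99", "context_links": 2},
]

def triage(alerts):
    # More accumulated context -> more credible lead -> worked first.
    return sorted(alerts, key=lambda a: a["context_links"], reverse=True)

queue = [a["id"] for a in triage(alerts)]
print(queue)  # -> ['a2', 'a3', 'a1']
```

The intuition matches the transcript: a context-rich lead is less likely to be a false positive, and a lead with no surrounding context can safely wait, which is how both false positives and wasted analyst hours drop at the same time.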
I traveled around the world and poked my head into the labs. I would go in and share with a lab what I do, because I was just the new thing that IBM bought, and then they would show me the stuff they'd been working on. I saw lots of really neat stuff, but two things really stood out as really close to my space. One was this thing called InfoSphere Streams. It's a super-low-latency, decision-pipelining engine. I'll tell you what G2 is to Streams: if InfoSphere Streams is the nervous system that connects your eyeballs and ears to the hippocampus, where you weave the data together to see how things relate, then Streams is the nervous system and G2 is like the hippocampus. So I had a lot of affinity for that, and I've designed my G2 thing to run in it. The other one: before it won the game of Jeopardy, I was talking to Dave Ferrucci, the principal investigator, and he was telling me what he did, and I was not hearing the same stuff I always hear in that space. The way the problem was attacked was different, and it was different in a way where, after all these hundreds of things I've built over the years, all my spidey senses went, you know, that's really an innovative way to attack that. So I have really high hopes for that as well.

For Watson, yeah.

Yeah, for Watson. I'm also really fascinated with flash. I'm excited to see IBM starting to make some really big bets on it, and I like all my systems that run on it.

All right, Jeff, hey, we've got to run. Thank you very much, really appreciate it. Good luck. Yeah, thanks. We'll see you, hopefully, on the other side of August. All right, Jeff Jonas. Keep it right there, everybody. We're right back with our next guest. This is theCUBE, we're live at Edge.