Live from the Mandalay Bay Convention Center in Las Vegas, Nevada, it's theCUBE at IBM Insight 2014. Here is your host, Dave Vellante. Hi, welcome back to IBM Insight everybody. This is Dave Vellante with John Furrier. We're here with theCUBE. theCUBE is our live mobile studio. We go out to the events. We extract the signal from the noise. Carla Gentry is here. Otherwise known as @data_nerd. Carla, great to see you. Welcome to theCUBE. You are a data scientist. You have your own company. We were just talking to Dr. Ahmed Bulut from a university in Istanbul. And he said, well, it's data science. There really isn't such a thing as a data scientist. And so he and I were arguing a little about it. So I said, come back and see Carla. You're a data scientist, right? Well, you know, right out of college, I started with RJ Crumman Associates up in Chicago. And that's what we all were. A bunch of data nerds in there playing around with terabytes of data before anybody even knew what a terabyte was. Back when a terabyte was really big, right? Right, back when a terabyte was big data. But, you know, gleaning insight for Discover Financial Services. And then, you know, I've worked with consumer packaged goods, education. I mean, it's been a wonderful, wonderful career. And what's so great about this is to be able to walk around and see how much data is a part of more people's lives now than it was 20 years ago. I mean, 20 years ago, you couldn't have, you know, gotten thousands of people together talking about data analytics. Well, you know, the interesting thing about what you're saying with all your CPG, education, financial services, John and I talk about this a lot, how the data layer is becoming a transport mechanism to connect the dots across different industries. And data scientists, you guys don't like to get locked into one little industry niche, do you? You like to gather data from all types of different sources. Talk about that. Well, that's the thing.
Unfortunately, we get bored very easily because we like to have our fingers in a lot of different pies. But you wouldn't want to be necessarily siloed with just one kind of information because curiosity makes you think about everything. Education, risk, you know, that way I have no walls. You know, I can glean insight from any type of data. If you've got a database, we can jump in with both feet. Is data, and why is data more transformative today in this day and age, you know, circa 2014, versus say when you came out of college? Why is it that everybody's talking about data, that data is able to change industries, transform industries, what's different? Well, now the, you know, data can actually give you, you know, an insight into your customer. I mean, you know, what is your customer buying, you know, so when you go to, you know, run a campaign or something like that, you're not shooting in the dark. You know, you're actually, you have a face to your customer. So, you know, you can make decisions and it's not just marketing, you know, which is what I started out in, you know, trying to increase and lift, you know, sales, but now, you know, you have risk, you have, you know, data breaches, you have, you know, what keeps CEOs up at night? You know, it's not only the cash flow, you know, it's the mitigated risk that's involved and when you're looking at your data and you're collecting this information, that gives you a view into what's really going on. So, you can sleep at night and have a little bit of comfort. Most CEOs say- Well, we're not sleeping at night. Well, right, right. For a couple of hours of sleep, I'm putting a notification- Sleep with one eye open. No, but this is a good point. CEOs and CIO, CFOs, Chief Data Officer, you're seeing much more formal roles around data, where data is the key asset and this is awesome because it brings to the forefront the role of data. And so, I want to get your perspective on this. 
You brought in kind of the trajectory of where we've come from. And talk about the role of software because really what this highlights here at IBM Insight is, okay, it's not just data per se, you now have software that's a key part of it. So, it's now also an integral part of the platforms. You have a developer angle. You have the data asset. And now you've got this real time in the moment experience. And IBM is talking about engagement a ton here. And so, what's your take on all that? I mean, it's exciting, certainly if you're in the data business. Well, definitely. I mean, real time data, of course, is very expensive, but it's more attainable now than it ever was. The thing is now is you don't necessarily have to be a data scientist to be able to go and get at your data. I mean, thanks to software tools like IBM, they give you that benchmark or these tools where you can use BI and things like that to be able to get a view into your business. And it's not just for your analytical department anymore. So I think it's what it's done is it's actually made it more attainable now. It was like, people looked at data way back then, oh, it was so scary. But now, it's bringing it to the forefront to where we can make decisions. We can run our business better. And I joined forces with Revo Software years ago to look at the supply chain. Now, when you talk about that, that's what keeps the lights on. Yeah, the lights on, but you're only as strong as your weakest link. So when you're working with third parties, you have to make sure that everything is going smoothly. So I want to get your take on a couple things. Inhi Cho Suh was on earlier, and she's an awesome guest. She's been on many times. She's dynamic and articulate, super smart, brilliant and beautiful. We love talking with her. She said, I asked her, what are the top three customer issues and investments? Kind of a double-edged question. She said three things.
Customer experience, operational assets, aka the supply chain, and then risk, security and governance. And then we weaved in context computing and then cognitive. So let's break that down. So customer experience, Internet of Things is a data play, probes and sensors and machines, certainly got that. Wearable technologies. People are things. Yeah. Well, here's the thing that you think about data. Data is a person. That record that you have in that database equates to a real-life person and you want to, you're not going to be friends with your customers, but you want to know more about them so that you can serve them better. For me, the biggest thing is people will go out and spend millions of dollars on a database, but not necessarily know what to do with it. So it comes down to what question are you trying to answer? Yeah, and the infrastructure piece is interesting because you want to have that agile flexibility, which is kind of a buzzword among vendors. Hey, be flexible, but there is meaning behind it, right? So context computing, it's relationships across entities. The streaming stuff is very, very interesting to me because now you have streaming data coming off of devices, again, brings up the real-time piece. So making sense of all this means it puts it in the forefront. And what you can do with that data is if you do have a client or a customer and you let them link in socially, like log in through Twitter or LinkedIn or Google, Facebook, now you can append that social data. So now you've got an idea of sentiment, whether it's positive- It's first-party data. Yeah, exactly. And that's the holy grail of active data, it's first-party data, which we love because we love the crowd chat, we love people logging in, and thanks for, by the way, for hosting the crowd chat with Brian the other day. It was really a fantastic conversation. My pleasure, my pleasure. Let's talk about cognitive because this brings a human element of it.
One of the things we've been teasing out over the past couple of shows we've been at around big data is the role of the developer, where the developers in the old days, from even going back to the mainframe days, their COBOL, they're in these rooms, almost like, there's almost an image of coders in the back room coding away. But now with the customer experience front and center with mobile infrastructure, the developers are getting closer to the customer experience. And so you're seeing more creativity on the developer side with the use of data. Could you share just observation, anecdotes, things you've been involved in that can tease out where this is going and how people should be thinking about it? Well, 20 years ago, if you tried to show someone a graph with 16 different things at one time going on, they were like, that's messy. Now you can actually find the sweet spot where everything interacts. So when you're talking to an artist, a digital artist who's working with data and giving that picture, that's exciting for me. And going back when we were talking about cognitive computing, when you're talking about the Watson Oncology, that's exciting. That's the highlight of my day. It's almost magic. It's almost like black magic. This Watson stuff. And people are really just now getting their arms around that. And that is essentially making sense of the data. But that's the thing. See, it's no longer magic now. That's what they thought 20 years ago. People like me, they kept in little closets in our office and they only came to us when they needed something. Now we're an integral part and we actually are in the business development meetings and we're a liaison between the IT department and the C-suite. One of the things that is interesting about your role is not only you out in the field doing some great work, you're also an influencer here at the IBM influencer program.
So I want to get your take on this balance between organic data and kind of structural data. Organic data means free-forming unstructured data and then existing data that comes in that's rigid and structured because of business processes. I get that. This is data warehousing. Business has been around for years. Business intelligence. It's all fenced in, all structured. But now you have these new inbound data sources coming in, being ingested by these large systems. Data changes the data. So you now have a new dynamic where latency, real-time insights, these are the new verbs. So talk about that role, the balance between organic data and the structured data and what the opportunities are. Well the wonderful thing about, now that unstructured data was scary way back in the day. So now it's not so scary. Now we can actually take this data and make business decisions. But like social data and things like that when you can add that and append that and get what we all want is a better view of our customer and to be able to do business with them like supply chain management and things like that. I mean you're looking at open people collecting information from varying sources and this all has to be put together. So I think they mentioned earlier this morning how 80% of it is we're data janitors cleaning up this, that and the other. Whereas what we really want to do is glean the insight from it. But I think the tools these days are making that much easier no matter what the source is that we can actually put it all together. What we used to call the merge purge back in the old days. The merge purge, yeah it takes weeks to do the merge purge. Yeah, who all here knows what a DLT is? I mean it's showing us. You've been trying to solve this problem for a while with traditional technology. 17 years, yeah. So let's talk about the promise of BI and the traditional data warehouse. 360 degree view of my customer, real time information.
And that's what it's about, it's about drilling down. Predictive analytics, all these promises. Did the data warehouse live up to those promises in your view? Well initially maybe not. But things are, it just seems in the last few years that people have had an epiphany of how this is really adding value to their company. Now back in the old days they all knew that insight is wonderful. But now you can see it visibly showing signs of actually making a difference in companies. So they can keep an eye on everything that's going on. Going back to what keeps CFOs up at night with the risk and stuff. There's still always the risk but at least now you can get a little better handle on it. And thanks to the age of technology and the data that we have accessible to us today and the tools we have available to us today it's made a dramatic change. So what are the technology catalysts? Is it Hadoop? Is it NoSQL? Is it, what are the tools that are sort of the foundation of that change? Well I think always the new tools and making it so that you don't have to go out and learn SQL. You don't have to be a programmer. You don't have to go to college for four years and learn mathematics and engineering to actually be able to work with this data. So thanks to tools like Hadoop and other tools I mean you can really sit down and glean insight without having to write one single line of code. So the things, we're getting some questions on the crowd chat, @data_nerd. What are the key things that are messy, scary right now for CEOs and CFOs? So things are becoming less scary. What are the scary things right now? Well the scary thing is the breaches. When you hear about Target and these big names, people getting access to your credit card data, that's scary. So we've got to really try to lock down that risk. And I know everybody's scrambling, scratching their head, figuring out how we're going to keep these breaches from happening again. Yeah and big data solves that.
I mean you have big data technology which is a combination of machine learning, streaming. We're getting massive surges of data coming into these ingest systems where you're going to apply some reasoning to it, some cognitive, some insights to look for the patterns. And that's where machine learning shines. How do you see that aspect of machine learning and these new tools affecting that kind of analysis? Well I see it opening up a lot of different doors for a lot of different people and making a difference because everybody knows that data is important but not a lot of people know how to deal with it especially when it gets into the zettabytes of data. When you have tools like the IBM tools that can handle this type of load and be able to give you instantaneous information and like what we saw this morning where like risk, I mean in oil and gas industry you have to worry about is someone going to get injured on the job and they showed the sensor where, as she walked toward it, it went off. I mean the internet of things being able to let us know in real time if there's a danger to personal life or to your database and then predictive to be able to say well this is what we think's going to happen in the future and to be able to move and act on that. It's a very exciting time. Well you mentioned IBM. So obviously IBM's the leader in here. Jeff Kelly's report shows IBM's the number one big data player but big part of that is IBM's so big, right? Well and you guys were around a long, you've been around a long time. Yeah, right. You guys were playing with big data way back before big data was big data. Well so, yeah. We have been. We have. Us guys, yeah. Well social data in particular. So those guys, right? So we're not, right. Those guys. But so you bring up IBM. A lot of people have a perception of IBM big, hard to work with. But you're mentioning. But that's changing. So talk about that change. What I'm excited about is the Watson Analytics.
I mean that in itself right there made me sit up and get excited about the data world all over again to be able to. What excites you about Watson Analytics, the platform? Well I really like the Oncology Watson. They had the one for the, not necessarily for the police, but for the crimes. I mean in real time if you can see that a crime is about to happen and you can prevent it. Or if you see someone's health is failing and you're able to step in. That's why over there earlier I was talking about IBM cognitive abilities can save lives. So I mean my mom passed away from cancer. So the Oncology Watson was very exciting to me. But it's going to make a difference. And I think the thing is now is that how it's changed is to make them user friendly where you don't have to have a data scientist or an analyst to come in. They talk about how expensive data scientists are. Now the reason I opened my business was to make it affordable to small businesses. So although people look at IBM and think it's scary, I think they're going to see now that the direction that they're moving is becoming more user friendly and more available. So Carla, I wonder if you could talk about how you engage with clients. You mentioned small business, right? Because a lot of businesses, small mid-sized companies don't have the resources. Right. So where do they start? Do they start with a call to you? And what do you tell them? Most of the time it's a call where we spend all this money on this database and we still can't get what we want out of it. We're not getting it all. So it comes down to what question are you trying to answer? I think that's the most important thing because that directly deals with what data that you need. And if you don't have it internally, can we get it externally? Can we go through open source? Can we get census data? Can we get, you know, work with hospitals and doctors and things like that and use this to be able to feed this information into them to make a difference.
So what do you do? I mean, the CEO calls up small companies and says, I got all this data. It's unstructured. I got some social data. I got my customer data. Trying to make sense out of them. Trying to figure out who's ready to buy, where I should be, you know, focus my products. And I got all this data. I don't know what to do with it but I know there's some gold in there. I know there's a signal in that sewer. That's all data mining. Right, so how do I get it? How can you help me? Well, it's gap analysis. First off, I would come in and I would sit down and first of all, I need to see what variables you're collecting. If you're telling me you're collecting your name, address and phone number, but you want to do a predictive model, we can't get that one here. So, you know, the question that you want to answer is most important. Are you wanting to increase your sales? Are you wanting to get your, to know your customers better, to be able to, you know, service them better? Like in the healthcare industry, you know, you really want to know what's going on health-wise. You know, so I sit down with them when we do a gap analysis. What are you missing? What do you have? How can we get it? What do you want? Where are you at? Yeah, and here's what you're missing. How do we get at that? And that oftentimes starts with data sources. Exactly. All right, so then you go get the data sources and then what, then what do you do? Well, then we merge it back in and here's the thing. You have to have that way to connect them. You know, the relational databases will always exist to where you have, you know, client information here and you've got other information over here and you have to always bring that back together. So, you know, it's a wonderful time now. But you're a data hacker, in a sense, right? Is that fair? Well, in a complementary way, I mean, hacking is about exploration. Yeah, exactly. 
Right, so I mean, so you have the skill sets as a data scientist to pull all this data together, analyze it, and... Well, you're going to bring in external source and when you bring it externally, you want to make sure that you can match it back up again. Now that's important and without a unique identifier, how do you do that? And that's why when you see databases with all these little arrows and everything pointing to where things belong, I mean, we have to be able to pull that in to make decisions. Yeah, we were talking with Franz yesterday, too, another influencer. We were talking about this particular point. He was ex-P&G back in the day, which is very data driven. Of course, they're well known for their brand work and certainly on the advertising side, but they're quant jocks over there. They love data. They're data nerds over there. They're geeking out on data. And he used to say that the software would cut off data points that were skewing way outside the median. And so they would essentially throw away what are now exploratory points. So this kind of brings up this long tail distribution concept where, okay, you can get the meat of what you want in the head of the tail distribution, but out into the long tail is all these skew data points that were once skew, standard off the standard deviation that are now doorways. So we're old enough to know that movie with Jodie Foster, Contact, where they find that little white space and they open it up and it's a huge puzzle. That's the kind of things that's happening right now. So are you seeing the same thing? Well, yes, yes. I mean, the thing is a lot of people don't necessarily have the information that they need. So they're seeking it. And they're going to what avenue? Where do I go to get this data? And thanks to open source and things like that, we've been able to get more information and bring it together than we've ever been able to do before.
And I think people now are more open to analysis where it's not necessarily a dirty word. It doesn't necessarily mean you have to go out and spend $300,000 a year to hire a data scientist. You can sit down and look at what you have and I heard someone else mention that: take the people that you have that know what's going on with your company. They may not be data scientists. They may not be analytical, but they have insights. There's more of a cultural issue now around playing with data in an experimental sandbox way where you don't need to have the upfront prove the case and then prefabricate systems. You can say, I'm going to do some stuff and just for instance, bring in data sources and play with the data. Well, and you mentioned, you know, outliers. I mean, everything when you look graphically at data you expect everything to fall within this little bubble, this, you know, this thing. But when you see, you know, all these outliers going on, for me, usually that means a mistake. So, and if it's not a mistake, it's something that calls attention. So it's definitely not something you just want to toss aside. Talk about creativity because creativity now becomes, you know, an aspect of the job where you got to be creative where it's not just being the math geek or being super analytical. You have to kind of think outside the box or outside the query, if you will, to do the exploration. What's the role of creativity in the new model? Well, before I think that we always thought of ourselves as just being, you know, matter of fact, you know, just the facts, please, you know. But now, you know, you can look at things visually and see, you know, it is an art form to be able to find that sweet spot in the data. And, you know, before, you know, years and years and years ago when you would take something like that to a CEO, he would say it was messy, you know. So now you get that creative side where you can actually make things visually attractive.
And I think that's important to people too because it's not just data, it's the way you present it. It's also the mindset of understanding messy is a good start. Start with messy and then clean it up. Start with messy and clean it up. Versus getting the perfect answer, as we were saying using it with Bapapachiana earlier about don't try to hit the home run right away. Hit a few singles using the baseball metaphor given the World Series going on. So, totally awesome. But I want to get your final thoughts as we wrap up the segment here on the practitioners out there. What should they do? So there's an approach to the job now, right? So there is a shift and inflection point happening at the same time. What advice would you give the folks out there who's saying, oh, I love Carla's interview. I want to do that. I just don't know where to start. What to do? How do I convince management? I want to get going. What would you share for advice? Well, it's the platform. I mean, you know, think about the foundation of a house. Now, if you have a strong data foundation, you can build on that. It's just like your house. If you have a weak foundation, your house is going to tumble down. So if you have a strong foundation with your data and everything is built right. Now, when I say built right, it means what are you trying to do? What are you trying to accomplish? You know, if it's risk, then you need to be looking at those factors. You know, how many people have been hurt? How many people have been injured? You know, how many people died? You know, I mean, how many breaches do we have? You know, so it starts with a question. What is it that you're trying to accomplish? And then you go from there and collect the right variables. So don't wait, you know, a year later and call a data scientist and go, and I've spent, you know, millions of dollars on this. I'm still not getting what I want. 
So think about it initially in the setup and, you know, be involved, involve your analyst, involve your data scientist. Make sure that they're in your business meetings because we're the liaisons between IT and the C-suite. Yeah, and that's the key roles. Team is a team effort. It's a team. It's a collaborative. We heard from Ahmed earlier, pair programming, pair, not pigs, it's an accent, pair programming. Work in pairs, buddy system. This is really a true team effort. Well, I always said, you know, I am a team, a data scientist can write programs. We can glean insight, but the team part has to come from working with IT and working with your C-suite. So very much agree. It's definitely a team sport. Carla Gentry, owner and data scientist, Analytical Solutions, influencer here at the IBM, special presentation, second experience, second screen here in the social media lounge. Really doing a real innovative social business. Again, activated audience. You're an influencer, but also you're really a subject matter expert. Thanks for coming on theCUBE. Really appreciate it. And thanks for hosting the crowd chat with Brian Fanzo. It was really good content. This is theCUBE. We are live here in Las Vegas extracting the signal from the noise, getting the data and sharing it with you. I'm John Furrier with Dave Vellante. We'll be right back after this short break.