Okay, we're back live here at Strata. I'm John Furrier, the founder of SiliconANGLE.com, and we are in theCUBE, and I'm here with my co-host. I'm Dave Vellante at wikibon.org, and we're here with CUBE alum Amy O'Connor from Nokia, who's a data analytics whiz, runs a group inside of Nokia. So welcome back. Thank you, thanks very much. It's nice to be with you guys again. Yeah, we had Amy on at the last Hadoop World in the fall. Seems like quite some time ago. It was just November, down in New York City. A lot's happened since then. Yeah, here we are again at Strata, our second year. So yeah, what's changed since we last saw you? Okay, so first of all, what's changed is, I brought with me my Lumia phone, which I'm incredibly excited about. Look at that, isn't that really? Beautiful. Really sweet. Check it out, can you? Eye candy for the camera. But, you know, we're here to talk about analytics, but my phone makes me excited, so I thought I would show you guys that as well. It's gorgeous, feels good. Yeah. I think we need phones for all the CUBE alums here. We'll work on that, we'll definitely work on that. But a lot's happened, so when we talked in the fall, I kind of was talking in three areas of work that we're doing. We're building a centralized data asset of the important data that Nokia's got to kind of bring the company into an area where we make ourselves much more data-driven. And we've done a lot of work since November, I think in three different areas. One is on the technology layer itself and the technology that we use to support this data asset. As you guys know, especially coming to conferences like this and talking to different people, how much the technology is changing over time. So we continue to evolve our ecosystem. In particular, we've done some really interesting work in the past couple months of actually integrating our Hadoop environment with our Teradata environment.
We have a, you know, what you might call a legacy data warehouse that's built in Teradata. It's got some really phenomenal assets and views built for folks to do insights there. So we've done a lot of work on some of our big data to process it and manage it and aggregate it and then move those to Teradata. So we're in that world now that people talk about of bridging the big data world with the traditional IT world and how do you actually do SLAs across all of those? And I think from a technology perspective, that's one of the biggest strides that we've made in the past few months. Yeah, so today the Hadoop is all new, right? Relatively new. Right, right. I said to Mike Olson, it's like the tail wagging the dog. Do you see that flipping over time? And he said, it's certainly in terms of the bytes captured. But how about in terms of the value? What do you predict there? You know, the value is in the end use of the data. But the world is changing in that the end use doesn't define what you do with data these days. So we're really trying to collect all of the major data assets of Nokia because we think we can come up with new uses. So I actually see the value in a full, we call it a full technology ecosystem. The value is in the petabytes of data that end up in Hadoop, but then they get culled down and aggregated down to terabytes of data and then gigabytes of data and handed off to users that have data marts and recommendation engines and data that flows back to these phones in particular. So with the traditional data warehouse, the sort of single version of the truth was always what we were chasing. Have we given up on that? Because it seems like we're quite far away from that now with Hadoop coming in. And is that okay or am I off base on that? There's a lot of places where you need single version of truth. You certainly need single version of truth in finance. You're right, all of that.
But a lot of the rest of the world is quite frankly changing too fast for there to be a single version of truth for more than a moment in time. And a lot of the use cases that we're talking about are use cases that are constantly evolving over time so that as we get new data and new data sources and we learn new things and we use machine learning algorithms, we don't even know what truth we're looking for. So I think the single version of truth for parts of the world, and those are the parts of the world that are around contextuality and user experiences, will constantly change. So what about, let's see, in the last year, John, the number of distributions. I saw Fujitsu announce a new distribution yesterday or the other day, and it's exploding. Does that matter to you as a technology practitioner? Do you just let them come and say, hey, we'll try a little bit of everything, and is it really not a big technology issue, more of a people and process issue? Talk about that a little bit. Well, I always say that the toughest thing at work is the people part of it. So clearly the most important thing for us to get right is the people and the skills and the ability to use data to do new and interesting things to drive higher values for the business. But we're challenged by the technology. As you said earlier, this is all new and it's interesting when we see new people coming on board with distributions. We look to vendors to support us. We look to, I mean, we don't want to spend our time down in the bits and bytes of Hadoop distributions. So we look to a really good partner to do that so that we can focus in the areas that add value to us. Now I think we've kind of all seen these worlds where there's some code in open source, different people pick it up and do different things with it. I hope and expect that all these vendors are putting the best of the best back to open source so that the entire world can keep evolving. And that's really what we look to.
So Amy, talk about your experiences for the folks out there dealing with Hadoop. We're hearing, obviously the show here, the standards are, it's being talked about Hadoop is now standard and Michael Olson was just saying, you know, hey, applications are surging in. You've been investing a lot in the development side of the framework, the methodologies around the big data, what's your experience with Hadoop? Where are you now? What are you using? Can you just talk a little bit about some of the details with Hadoop? You know, we use, we use probably pretty much everything that's in the standard Hadoop distributions that we actually pick up through the Cloudera distribution. So, you know, Pig and Hive and we're really, really moving a lot into the HBase world. We have a lot of different application uses for HBase and I can talk about some of those a little bit more if you're interested. And you are using HBase. Yes. We use Oozie and ZooKeeper and Scribe, you know, you name it, the alphabet soup, we Sqoop with the best of them. And then kind of, oh yeah, we've got Pig. Yeah, we've got all of that. And then kind of around that world, we also use a lot of, you know, we use Ganglia, we use Nagios, we use around the top of that world, we have a SQL-based world where we do a lot with Oracle, a lot with MySQL, and, you know, as I mentioned earlier, a lot with Teradata. So I think you name it, we use it. Nokia's a huge company, 60,000 employees, spread all over the world. So we've got a lot going on. How is your experience with that product? I mean, as you're developing, good, bad, ugly in general? With Hadoop? Yeah, it's been good, positive? It's been very positive. I would have to say, given the newness of Hadoop, you know, the fact that it's really only been around for a few years, that it works quite well. We do rely on our relationship with Cloudera to help us quite a bit.
We just even, you know, there was just an instance in the past couple weeks where some of our MapReduce jobs were counting on certain things on the platform working certain ways and we put a new version of the platform in that broke some things that these jobs probably shouldn't have been doing. And Cloudera was right there with us on site to help us take care of the issues. So you're using Cloudera for support and you got the license, the enterprise edition? Yeah, yep. CDH license? Yep, yep, CDH license. And I think even in those cases, it was less of a, it wasn't an issue with the distribution or what we were doing from a platform perspective, but I think the places where we're all learning the most is how do we use MapReduce? How are these jobs running? You know, how do we lay them out? What's the, we're constantly trying to profile the jobs themselves to say, you know, what do the compute cycles look like? What data are they touching and all that? And that's an area where we're spending a lot of time now. And you support analytics for Nokia globally? Yes. So how do you deal with all the requests that you must be getting? Do you advertise your capabilities? Do you hide them? How do you manage that whole process? It's a good question. We're at the first stages of that. And there's a few different things that we can look at that have happened organizationally inside Nokia. We've got a very strong and good IT organization and they're the ones that support the Teradata data warehouse, for example. They've got some really strong processes and support in place for that single point of truth in the data. So we're partnered with them and we kind of are building the big data side. Well, that being said, we also have many, many very smart people around the company that have pulled down Hadoop distributions and built their own clusters.
And part of what we started doing in the past year was to reach out to different folks to say, we really want to build a single data asset and here's the business value of doing that. And what's really turned around in the last couple months is that everybody's bought into this concept of big data and the value of big data. So what we're doing right now is we've taken a step back and instead of focusing on ingesting every data source that we can see into our centralized cluster, we're actually in the process of doing inventory across the whole business. Who is creating data? What is the point of creation? And then who right now is consuming data? And in many cases, it's a single application silo that's consuming data. And maybe for the next year, it's only going to be one application that needs to consume that data. So we push it a little bit lower on our priority list to move it to a centralized Hadoop environment. But then there's cases where there's value of the data across many different application environments and many different analysis environments. Are you classifying those initiatives on the point of creation or can you? Is that something that you can see doing? We're just starting to. Can you automate that? Or is that in that way? In that classification. In other words, you're taking the inventory now, figuring out, okay, what's going on? What's being created? Are you going to actually inject some metadata around that creation so that you can leverage it downstream? It's a great idea. We'll bring you on to help us do that. If you can automate that, then you can really scale it. Yeah, we're kind of at the brute force layer right now, but I think we'll get to different levels where we definitely need to do that. Okay. So where are you guys at? So Michael was talking about the application explosion, platform stability. You guys, you got the platform up and running. You're building on that.
What apps are you looking at around the corner as you guys have your vision and your roadmap? Yeah, so really. The big data. Don't mean necessarily Nokia kind of apps. No, no, no. Yep, absolutely. Some of those Nokia apps become big data apps because what sits on your phone, our drive application, obviously wants to consume contextual data about your world and recommend a better way for you to drive or where you might want to stop along your drive to do different things. Maybe you need gas and we can actually figure that out for you. So the apps themselves will consume data, but there's multiple different pockets where we're using big data. One is just in what people think about in standard business insights themselves. And an example there is we purchase a lot of data from third parties. For example, we have a very, very large places registry where we document all of the points of interest around the world, to the tune of 70 million or so points of interest around the world. And we can actually use big data to determine how are those points of interest being consumed by users on the phone? And maybe some of those sources aren't worth us purchasing. So there's simple things like that where we're doing insights. And another example is we're ingesting the logs from our Hadoop environment itself into our Hadoop environment. And we're analyzing how our Hadoop environment is working to say which jobs are running on which sets of data, where are the hot sets of data and all of that. So those are kind of two interesting things. Analyzing the analysis. Exactly. Yeah. I mean, actually a personal question is, are you guys having a fun time with this or what? I mean, sounds fun. You're doing a Hadoop for Hadoop. Using Hadoop to make Hadoop better. Yeah, absolutely. Yeah, that part's super fun. There's a few days where it's not so fun. I'll admit that, but for the most part we're having a blast. So inside the company now, this is more of a Nokia question.
Obviously the trends that we've been talking about here in theCUBE for over a year and then just recently here with virtualization, with flash, it's all new, this enablement, okay, et cetera, et cetera. But really the big mega trends have been mobile and social and this cloud computing trend which we've been covering since Silicon Angels this year, doesn't deny. You guys are in the mobile business. So you got to double win there. More pressure, has it moved up to the top of Nokia? Is it clear to the company that in order to be good with mobile, you have to be good at big data and be good with other people with big data? And how does that, how's it shaking out within Nokia? What's the buzz internally? The buzz is data, data, data, data. We have a long way to go to really turn that buzz into a lot of impact on the business, but in the past six months the buzz has totally become data. And just to, my calendar, I'm triple booked every hour. I can't read all my email. Actually I need to get somebody in my group to do an analysis on my inbox to figure out which emails I really need to respond to. Which meetings to blow up. Yeah, that part's so cool. Data analysis, well that person's influence isn't as good as it used to be. So what was the buzz before data? Like the last big buzz, was it product, what was the, was it all over the place? So I haven't been at Nokia long enough. I think when I stepped in, which was just about a year and a half ago now, it's been data, it's been data for that amount of time and the buzz just is louder and louder every single day. Yeah, I mean the reason I ask is it feels like at a lot of corporations, the buzz is data and it used to be a lot of different things, whether it was a go to market or a new product or whatever it was.
Well, one of the things that we've talked about is how we've changed over time. You know, Nokia's a 150-year-old company and it started as a wood grinding plant on the banks of a river in Finland and then became a rubber manufacturer in Finland and then became consumer electronics, and so one of the things that we talk about is it's always been a company that took raw materials and produced products, and we now believe we're kind of beyond how we build this phone itself and we're into taking data as our raw material to build new products. Awesome. All right, Amy O'Connor, well listen, thanks very much for coming back on theCUBE. It's always a pleasure to see you. We'll see you back in Massachusetts. Yeah, thanks for sharing your use cases with us. That's great, people are trying to figure out what to do with it, a lot of questions, and it's creating a services boom right now, and we have servicesangle.com, a dedicated section within SiliconANGLE, dedicated to services because we used our big data to predict last year that the services industry is under massive growth because of the side effect of big data. And you would know something about that, like former Sun services. Are you an alpha geek by background or a business person? I'm both, I'm a dual-headed monster. All right, dangerous. You've got an open invite on theCUBE anytime, thanks for sharing your knowledge with us.