EMC World, live on theCUBE. I'm Jeff Kelly, Wikibon's Big Data Analyst. I'm here with my good friend, Yves de Montcheuil. Did I pronounce that correctly?
You did. Hello, Jeff. Good to be back on theCUBE.
Good to have you. It's your second time on theCUBE with us.
It is, it is, yes.
So, yeah, we were at Strata last time. A lot has happened in the big data world between now and then. So why don't you fill in our audience a little bit about Talend and where you guys come from, and then we can talk about the role of data integration, et cetera, in the big data movement. And I'd also like to hear about your impressions of the show here.
Wow, a lot. All right. Well, you know.
A lot to fit in on theCUBE.
Three months is a long time in the big data world, especially since big data is open source driven, and open source is a factor of innovation, so everything is accelerated. So what has happened in the world of big data and in the world of big data integration, especially because this is what we are dealing with: we actually came out last week with the GA version of Talend Open Studio for Big Data and the Talend Platform for Big Data. And as you know, Jeff, we are trying to democratize big data. It's not an easy feat. And big data is very complex technology. Of course, you know, Hadoop makes it simpler than having to write massively parallel programs on your own. You've got MapReduce, which is a terrific framework for that. But it still requires some fairly advanced programming skills. I mean, MapReduce is not something that any developer can grasp overnight. You need some serious Java development expertise. And data scientists are in high demand. I mean, those guys can actually demand pretty high salaries and there are not a lot of them out there.
Right, absolutely.
So what we want to do here is to make it affordable for companies to actually use big data, to use Hadoop technologies.
So the same way Talend has historically been democratizing integration, we are set on a mission to democratize big data.
Why don't we dig into that a little bit more, democratizing big data? Exactly what do you mean? And what's your approach to actually making big data more accessible from a data integration standpoint?
Right, so as I was saying, instead of having to write that complex, you know, Hadoop programming, MapReduce programs, Pig Latin scripts, HiveQL and whatnot, what Talend provides is an additional level of abstraction. So it's a completely graphical user interface. You design your data integration jobs, drag and drop components, your sources, your targets, the transformations in between; everything is built graphically. And then Talend is going to generate code. So it's not engine-based, it's code generation, which means that that code will run inside Hadoop. So of course we are keeping the power of Hadoop, we are keeping the massively parallel, MapReduce kind of approach, but doing that in a much simpler way.
So the idea is to abstract away some of that complexity.
Abstract it graphically and make it more powerful, bring additional features such as, for example, data quality. You know, big data, if it's bad data, becomes big, bad data, and you get into trouble a lot more quickly with big, bad data than with just bad data.
Absolutely. So you mentioned Hadoop, the open source community, Talend, an open source company. So we're here at EMC World, and EMC's approach to big data is fairly proprietary, though they're opening up some aspects of their business. Talk to me a little bit about how Talend fits into the big data ecosystem and how being an open source company either helps or hurts in that environment.
Well, you know, you make a very good point. Of course EMC has a lot of technology, and EMC is a huge company; they do lots of big data, cloud and many, many different things.
But when it comes to Hadoop, EMC, and specifically their Greenplum division, is actually embarking on open source technology. MapR, which is one of the three leading Hadoop distributions, is actually embedded as Greenplum MR, which is part of Greenplum HD. So it's based on open source. Pretty much everything out there in big data is actually based on open source. You know, even probably the largest proprietary software vendor in the world, the guys who advertise in the Las Vegas airport and on taxis here.
Yes, those guys.
We don't really like EMC, but their logo is red. You know who I'm talking about?
I think I know what you're talking about.
They have a deal with Cloudera. So another Hadoop distribution. So everybody who does big data uses Hadoop. And Hadoop is essentially open source.
So was that a more natural fit for Talend? Because you've been around since before the big data movement really started, or at least was named. So I imagine your legacy is in helping organizations with some of the more traditional data integration issues, moving data from operational systems to analytic environments, things like that. So how has being an open source company helped you get into this market? And maybe dig in a little bit to some of the differences, the challenges that big data integration poses versus more traditional data integration, for lack of a better term.
Yes. So the first thing is that, you know, big data is not something really new. Companies have been doing big data for a long time. The difference is that they now have technologies that allow them to democratize big data. And the technology is primarily Hadoop. You've got other NoSQL databases, but it's primarily Hadoop. But companies have been doing big data with Greenplum, for example, for a long time. You know, it's a massively parallel architecture for a columnar-oriented database. And we have had support for Greenplum technology for a long time.
So when it comes to Greenplum, now we have combined support for the Greenplum, I would say, traditional database, as well as for the Greenplum Hadoop distribution. And because Greenplum allows you to bring together the conventional, structured, relational data with the semi-structured, poly-structured and unstructured big data, Talend offers a very interesting value proposition for Greenplum users. Now that's true, of course, of Greenplum users, but that's true of pretty much any database user out there who wants to combine, let's say, relational data with big data.
Absolutely. So yeah, we're seeing that as a use case among our community at Wikibon. So tell us a little bit about what you're seeing among your customers. Is adoption of what is being called big data happening to a large degree within your customer base? Just to gauge where in the adoption life cycle of big data you're seeing your customers.
Well, historically, many of our customers have had massive amounts of data that they've not been able to leverage efficiently because they didn't have access to technology that allowed them to leverage this big data. You know, you had to be essentially a financial institution or, I don't know, a massive e-commerce company to have big data technology in the past. Hadoop democratizes big data, allows you to deploy essentially free technology, or at least very inexpensive technology, on a grid of commodity hardware, so it doesn't require you to invest in super expensive hardware. So Talend helps you, of course, implement those Hadoop architectures, do it faster, do it more efficiently, and doesn't require you to invest in very expensive data integration software. It would be kind of a contradiction to use the, I would say, open source framework for big data but have to use very expensive data integration technology.
Absolutely. So when you go into customer situations, are you doing a lot of education around just what is big data? Are we still at that level, do you think, in the market? Or are companies, your customers, starting to get ready to take that next step and really start not just experimenting with Hadoop and some other big data approaches, but actually putting some systems into production?
I think that companies today understand what big data is. They understand what kind of big data they have. They don't necessarily understand how they can get value out of their big data. So that's where the education resides now. So they've got those, you know, those log files, those historical customer records, those transactions of all kinds depending on their business, and they are trying to figure out how to extract value out of them. Now, a lot of companies are still in the very early stages. I mean, we are still in the early adopter, very early adopter phase. And they are experimenting. They are working by trial and error, in iterations. And that's also a place where open source brings tremendous value: you don't need to engage an army of developers to do that stuff. You can try something, do some what-if scenarios, iterate, try again. So again, being able to leverage an infrastructure that's affordable, that can be used without advanced expertise, brings tremendous value for those companies who are trying to figure out what kind of value they're going to derive from big data.
Right, that whole iterative approach is very much at the core of what big data is all about, being able to ask questions and experiment before you go into full production and start integrating some of your insights into business processes and applications. So that's a perfect fit there.
Absolutely, and that's what data scientists are about. Now you've got different, I would say, degrees of expertise for data scientists.
As I was saying, you've got the advanced Java developers who have MapReduce expertise, and you've got the guys who know their data, who know what the data is about, but don't necessarily have the expertise to go and deploy it.
Right, absolutely. So, you know, like me, this is, I believe, your first EMC World. Is that right?
That's the first time I am coming personally. Talend has been a sponsor of EMC World in the past, but that's the first time I'm here myself.
Likewise for myself. So I'm curious to get your impressions. For me, I think what really struck me was the very start of the conference, Joe Tucci's keynote, and he led with talking about big data predictive analytics as kind of the killer application of the future. This coming from EMC, which has largely been known as a storage company, an infrastructure company. So to me, that says something about where EMC's priorities are. I wonder what's your impression of the show and of the evolution of EMC that we've seen over the last couple of years since the Greenplum acquisition and some others?
Well, I mean, frankly, Jeff, I'm blown away. EMC is a gigantic company with a gigantic customer base, with a very broad offering, and a very consistent message for a company this size with the number of products that they have. I think they are doing a great job of bringing everything together. I mean, as you said, Joe Tucci's keynote, I mean, beyond the fact that the theatrics were absolutely fabulous, you know, the space stage, and that's great. You know, the story around hybrid cloud and big data was very consistent, and all the products actually tie into this strategy.
Yeah, absolutely. I think they put on a very good show and they are all on message, which as you said, in a company this large with this many product releases and things going on this week, is not an easy feat, so they should be commended for that.
I mean, look at some other very large software and hardware companies.
They have been pretty consistent over the years in their positioning. I mean, Oracle is still Oracle. It's bigger than 20 years ago, but still the same type of company. SAP, same story. I mean, EMC, yeah, they are coming from very far. They were a storage hardware vendor and now they are all about transforming IT, transforming business, transforming yourself, as I've learned from the tagline. A very, very interesting message, very interesting strategy. I'm already looking forward to next year to see the further steps they take in that evolution.
Yves, thanks so much for being with us today. We appreciate it. We are going to take a quick break and we'll be back in a little bit with data scientists from Greenplum. They're going to talk about some customer use cases and some other topics of interest to data scientists. So we'll be right back.