The Cube at Big Data NYC 2014, brought to you by headline sponsor WANdisco, with support from EMC, MarkLogic, and Teradata. Now, here is your host, Jeff Kelly.

Welcome back, everybody. This is theCUBE. We are live at Big Data NYC, in the heart of Times Square, covering all the action around big data in the city this week, and there's a lot happening. Strata Hadoop World is happening right down the street, we've got Big Data NYC happening here, and we had a great night last night at our capital markets event and Cube party celebrating five years. So we're excited to get started today on day two of our coverage. I'm kicking things off with Dan McClary, Senior Product Manager with Oracle. Dan, thanks for coming on theCUBE.

Glad to be here, Jeff.

So let's talk a little bit about the vibe down at the show this year. Strata Hadoop World has moved from the Hilton up on 52nd Street down to the Javits Center, and it's bigger than ever. What's the feel down there?

It's a bit different. I think the core technical audience is still there, but the preponderance of companies that have brought real solutions to market and are in fact doing real business has really exploded, and it's changed the nature of the show. Not necessarily for better or for worse, but there's definitely a different activity on the show floor this year.

Yeah, it's interesting. I've had a few conversations where we hear there are actual real deals happening down there. And as you mentioned before we went on camera, there are still a few sessions with live code happening. So the geeks and developers still have some content for them, but there's a lot more of a business feel.

Yeah, I'm pleased to say that I feel my session still had enough live code in it for me to feel credible as a technologist.
But, you know, a number of the sessions focus on how big data technologies are emerging as solutions and how those solutions can actually make it out to customers.

Yes, and again, we were chatting beforehand, and one of the things you mentioned that struck you was how many product companies are using the term end-to-end to describe what they do. Can you expand on that a little bit? What are you seeing?

So it's interesting: walking around both the vendor section and attending a number of sessions, everyone wants to position an end-to-end solution. And it makes good sense to me, because the big data ecosystem is complex enough that being able to span it is very, very important. Although from our perspective, end-to-end means something a bit different, and our endpoints are a bit further out.

Yeah, well, it's interesting, because when you think of Oracle, in a lot of cases you think of Exadata, you think of a complete integrated box, essentially, that you drop in your data center, you flip a switch, and you're off and running. Some people think of that as end-to-end. But applying that concept to the world of Hadoop is a challenge. You look at the number of vendors out there, and there are so many different moving parts in any big data architecture at any given company. It's hard to imagine somebody could really offer a true end-to-end Hadoop environment.

Yeah, and we do our best, in fact. We have an appliance, an engineered system, that's designed for the Hadoop ecosystem, and we attempt to make that as end-to-end as possible. And there are certain advantages we get from that. By virtue of our involvement in the Java language, our ability to manufacture hardware, and our understanding of a lot of the ecosystem, there's a lot we can do. But the ecosystem is so fast-moving that the endpoint for us is constantly shifting as well.
The bare metal, the JVM, that's stuff we can always get right down to. But we are having to keep pace and, in some ways, change the way we work. As the big data ecosystem continues to expand at an exponential rate, it's changing the way we have to develop and the way we think about release cycles.

Let's dig in a little bit, because the open-source Hadoop world is evolving faster than any ecosystem I've ever seen. Put that into context in terms of Oracle's development cycles. If you're putting together an end-to-end appliance solution, big data in a box, if you will, that requires a lot of pre-engineering work, and it must be a challenge to align the continued development in the ecosystem around Hadoop with developing and delivering a hardened, complete, enterprise-ready appliance. How do you work with that ecosystem?

It's been an interesting challenge to take what is a very well-established workflow for how you build and ship hardware and software at Oracle and meld it, I would say, with the best pieces of the agile development that spurs a lot of the Hadoop ecosystem. And it's created an interesting mix, because hardware timelines are what they are. The testing of hardware takes the time it takes. Our QA process never lets up. We cannot, in any way, sacrifice the standards that enterprise customers have come to expect from us. But this means we necessarily have to change the way we schedule things. Our customers are always interested in having the latest and greatest pieces of the ecosystem. They want the latest version of Spark. They want the latest version of Impala. They want to make sure they have the latest version of YARN and that security management is all integrated. But they also want to make sure that we've tested it the same way we test the database. And what that causes us to do is plan well in advance.
We end up looking at beta code and beginning our tests there, and then trying to very quickly ensure that any new functionality we bring is done in an agile way, and in a modular enough way, that it can be run through our rigorous QA processes but no longer has to take one to two years to make it to market. It's been a very interesting lesson for our development teams, and I think it's showing us as a company that there are many ways we can bring our technology to market.

Well, those are some really good points. The development of this market, I think, is forcing some of the incumbent players, some of the large vendors like Oracle and IBM, to really rethink the way they develop and release software and, in your case, package appliances. But having Oracle here on theCUBE, I'd love to get your perspective on the larger dynamic of big data, what's happening in the Hadoop ecosystem and the NoSQL ecosystem, and how that's impacting Oracle as a larger enterprise. You think Oracle, you think database, and this Hadoop and NoSQL movement is all about the database. How is this impacting Oracle? How are you adapting? How does Oracle see itself in this world as big data becomes more mainstream?

It's actually been surprisingly good for us. If you look back a few years ago, people would declare, oh, the data warehouse is dead; Oracle and Teradata are definitely in for it because Hadoop's coming to get them. And what we found, and I think most of the people you see down at the show floor have come to realize, is that these are adjacent technologies. And then there's the preponderance of SQL on Hadoop, right?
The fact that SQL is becoming table stakes for any big data installation is really great for us, because we sometimes forget that SQL is an incredibly powerful language and that its declarative constructs are really useful. It's nice to have SQL be cool again, for people to be talking about, oh my, have you heard about optimizers? Have you heard about cost-based optimizers? So for us as an organization, it's somewhat inspiring, because we can look at the work we've done and say, oh, we do do cool things. And it also causes us to start to think about what we consider to be higher-order problems.

So if SQL on Hadoop has become table stakes, we've certainly spent a lot of time looking at the space, and we're great proponents of it. We like SQL for lots of obvious reasons. We see things like Hive, Hive on Spark, Spark SQL, and Cloudera Impala, and we really embrace a lot of these things. We ship them as part of our appliance, and some of them we use ourselves internally. But when we came to the point where we wanted to deliver Oracle SQL to this sort of environment, it seemed like we weren't solving the right problem if we only said, well, here's Oracle SQL that runs against Hadoop. We tried to step back and say, SQL on Hadoop is going to mature in the ways that it matures, and there are going to be many ways to get it, right? If it's table stakes, you can pretty much get it how you want to get it. So what could we do that was unique? What could we do that was really representative of our capability not only as a database company but as a data management company? It was to this end that we announced this summer, and took GA just this last month, a product called Big Data SQL. This is very cool for us, because we're able to say: do what you want in a Hadoop environment; do what you want in a NoSQL data store.
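[Editor's note: the cost-based optimizers mentioned above pick a query plan from simple size estimates. Here is a toy sketch of the idea in Python, with hypothetical cost formulas; it is not any real database's cost model.]

```python
# Toy cost-based join optimizer: choose the cheaper of two join
# strategies from simple cardinality estimates. A hypothetical
# sketch of the idea only, not any real database's cost model.

def nested_loop_cost(rows_outer, rows_inner):
    # Compare every outer row against every inner row.
    return rows_outer * rows_inner

def hash_join_cost(rows_build, rows_probe):
    # Build a hash table on one side, then probe with the other.
    return rows_build * 2 + rows_probe

def choose_join(rows_a, rows_b):
    """Return the cheapest strategy name and its estimated cost."""
    plans = {
        "nested_loop": nested_loop_cost(rows_a, rows_b),
        "hash_join": hash_join_cost(min(rows_a, rows_b), max(rows_a, rows_b)),
    }
    best = min(plans, key=plans.get)
    return best, plans[best]

if __name__ == "__main__":
    # Joining 1,000,000 fact rows to 100 dimension rows:
    print(choose_join(1_000_000, 100))  # the hash join wins easily
```

The point of the declarative style is exactly this: the user states the join, and the engine picks the strategy.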
But when it's time to bring the value created there back to your existing business infrastructure and your existing applications, you can now use Oracle SQL to effectively query all of that data in place. It's no longer a matter of piping stuff across, loading it into a database, or waiting for someone to develop a better ODBC or JDBC connection to the store of your choice. You can instead say: here are the things I use to run my business; here's the net new value I'm creating in the big data ecosystem. Tying those things together is as simple as saying, oh, just join them. For us that's been extremely cool, and the reception has been really, really great.

Yeah, I think one of the areas where vendors like Oracle have an opportunity is to move a little bit higher up the stack in helping organizations, as we talked about. This is really complex stuff, and part of the opportunity is helping organizations understand how it's going to integrate into their infrastructure, connecting systems at an almost virtual layer above the data stores and the databases. You're going to have data in many places; the idea of one physical data lake, I think, is not going to happen at most organizations. It's going to be more of a federated, virtual model. And I think companies like Oracle have an opportunity to bring their expertise around things like SQL and other integration capabilities to create that more virtual data infrastructure, so the business user doesn't have to worry about where the data is stored.

Some of it, Jeff, is the promise Oracle made to its users 30 years ago, which is effectively: no application changes, and we'll run your code in more places and faster. It just extends, right? You have an ORM that speaks SQL? Well, I want your SQL to run in more places, across more data. I don't want you to have to change your existing investment. I want you to focus on creating new value.
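[Editor's note: the "just join them" idea, querying data where it lives rather than piping it between systems, can be sketched in miniature with SQLite's ATTACH, which lets a single SQL statement span two separate database files. This illustrates the concept only; it is not how Big Data SQL is implemented.]

```python
# Minimal illustration of one SQL join spanning two separate data
# stores, in the spirit of "query the data in place." SQLite's
# ATTACH stands in for a federated engine here; this is a concept
# sketch, not Oracle Big Data SQL's actual mechanism.
import os
import sqlite3
import tempfile

tmp = tempfile.mkdtemp()
wh_path = os.path.join(tmp, "warehouse.db")
lake_path = os.path.join(tmp, "lake.db")

# "Warehouse": the data that runs the business today.
wh = sqlite3.connect(wh_path)
wh.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
wh.executemany("INSERT INTO customers VALUES (?, ?)",
               [(1, "Acme"), (2, "Globex")])
wh.commit()
wh.close()

# "Lake": net-new clickstream data landing in the big data system.
lake = sqlite3.connect(lake_path)
lake.execute("CREATE TABLE clicks (customer_id INTEGER, page TEXT)")
lake.executemany("INSERT INTO clicks VALUES (?, ?)",
                 [(1, "/pricing"), (1, "/docs"), (2, "/pricing")])
lake.commit()
lake.close()

# One SQL statement spanning both stores: no ETL, just a join.
conn = sqlite3.connect(wh_path)
conn.execute("ATTACH DATABASE ? AS lake", (lake_path,))
rows = conn.execute("""
    SELECT c.name, COUNT(*) AS clicks
    FROM customers c JOIN lake.clicks k ON k.customer_id = c.id
    GROUP BY c.name ORDER BY c.name
""").fetchall()
print(rows)  # [('Acme', 2), ('Globex', 1)]
```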
And we really view a lot of our work in the big data space at that higher level: this is a rapidly evolving ecosystem, and it evolves far more rapidly than any single vendor could ever really hope to tame. So what we can do is fill in the things our customers need, whether around security, functionality, or accessibility.

So let's talk about one of the themes I've noticed over the last day or so across a lot of the CUBE interviews: we've been talking less about the hardware, the plumbing, the infrastructure, and more about what we are going to do with all this data. I think SQL on Hadoop is one part of that, but another part we've been talking a lot about is machine learning, and really applying machine learning to data in real time, so you can impact a customer decision or a business process in real time. What's Oracle doing in that space? How do you look at, for lack of a better term, operationalizing in real time some of these insights people are trying to develop in Hadoop and other big data stores?

So it's interesting. I think machine learning is probably a close second to end-to-end in terms of buzzwords around the show this year; Spark may actually take the cake completely. But it's interesting because, as much as there are what you would consider traditional concerns, how do we integrate this, how do we secure this, the demand for data discovery, for data discovery beyond BI, for machine learning, and for stream processing has really exploded in the last 12 months. And we're certainly trying to make sure that we're well positioned to offer our customers solutions, but also well positioned to let customers operationalize other solutions that they choose to deploy or build themselves.
So on the one hand, we announced a product at Oracle OpenWorld called Big Data Discovery, which is designed to let the business user do the data discovery, the machine learning, and the exploration of data they need to in a very friendly, visual, business-user environment, but underneath harness the power of the big data ecosystem. Largely it's built on top of Spark and uses a lot of things that come out of MLlib, and it really gets you the best of both worlds: a very comfortable business-user interface, with all of the power of the Hadoop ecosystem underneath. On the other hand, we're very active in supporting and deploying analytical solutions, be they on Spark, on Hadoop, or within Oracle Database. We're big proponents of R; we embrace and support R in a way that many other companies don't. And I think we're really looking forward to seeing what applications our customers end up building with some of these new technologies. Part of the reason we're so interested is that if we can help our customers operationalize these things and do them well, it will also drive our inspiration for new products.

Yeah, absolutely. So you mentioned customers; let's talk a little bit about that. What are you seeing out there from your customers in terms of the most exciting things they're doing in this space, whether it's analytics or machine learning? What are some of the more interesting and forward-looking things you're seeing?

It's interesting, because I think our customers fall into many buckets.
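[Editor's note: for a flavor of the machine learning under discussion, here is a tiny pure-Python k-means, the same clustering algorithm MLlib runs at cluster scale. A self-contained sketch of the algorithm only, not Oracle's or Spark's API.]

```python
# Tiny k-means in pure Python: the same algorithm Spark MLlib
# parallelizes across a cluster. A self-contained sketch, not the
# MLlib or Big Data Discovery API.
import math

def kmeans(points, k, iters=20):
    """Cluster 2-D points into k groups; returns (centroids, labels)."""
    centroids = points[:k]  # naive init: the first k points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lbl in zip(points, labels) if lbl == c]
            if members:
                centroids[c] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))
    return centroids, labels

if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    cents, labs = kmeans(pts, 2)
    print(labs)  # the first three points share one cluster, the last three the other
```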
The most exciting cases, though, invariably come down to customers coming back to us 12 or 18 months on and saying, you know, this is data we never could have had before, and we weren't sure it was going to be useful, but the fact that we have it has actually changed something material in our business, whether that's a quicker decision around supply chain or new product placement, or simply being able to give a team of analysts data they've been asking for for years. It's really remarkable, and really heartwarming, to hear the stories of: we captured this data, and you were right, it was great for this process, and now other parts of the business are coming to us and saying, what questions could we ask? What new answers could we find if we had access to this same data you've captured? It's democratizing the internals of the companies that work with us in a way that's really, really pleasant to see.

Well, it's interesting, because one of the big value propositions of big data is that the return on investment you get from data increases the more you use it. There are multiple uses for any one set of data, and if you can open that data up to multiple business units, users, and use cases, you're going to get more value out of it.

As a capital asset, data is somewhat unique: you can invest it many times.

Yeah, absolutely. So Dan, unfortunately we're just about out of time, and I want to give you the last word. What's on the roadmap from Oracle in terms of big data, at least what you can share? What are your top priorities over the next six to twelve months? When we're back here next year, what will we be talking about?

I think, Jeff, it's going to be really simple: greater operational simplicity, greater access to all of your data, and really just speed of operation, speed of time to value.
That's good stuff. Dan McClary from Oracle, thanks so much for joining us on theCUBE. Appreciate it. Guys, thanks for watching. We'll be right back with our next segment here, live at Big Data NYC, after this.