Okay, we're back here live in San Francisco, California. This is Oracle OpenWorld coverage. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. We go out on the ground and get all the action. We talk to anyone we can talk to who's got the signal and bring that to you. That's what SiliconANGLE and Wikibon do. I'm John Furrier, the founder of SiliconANGLE, with my co-host Dave Vellante. And our next guest is Steve Karam, who's a data and content technologist with John Wiley and Sons, which is in the book business. But the book business is changing, and we were discussing before we came on the notion of metadata, which became a mainstream topic outside the geek circles when we heard about the NSA monitoring us. The word metadata became something of a word: metadata, what the hell is that? But we know the value of data, and with big data, new business models and new innovations are happening, and it's all about access and discovering content and information and people. Steve, welcome to theCUBE. Thank you. We were talking about books. Books being digitized, it makes sense. You've got e-books out there, great. But authors have domain expertise, they have riveting stories that they tell, and they actually write content, which can be captured as metadata. And this is interesting because now you can use that content to go out and explore new data types, and the theme of big data is about unstructured new data types that are not necessarily in relational databases. Some are, some aren't. You've got taxonomies, you've got folksonomies. Tell us a little bit about your vision and the things that you're working on, because it's really compelling how you're using data in the business that you're in and how you're using data to drive new user experiences and new benefits for authors. Sure. Well, you know, what's amazing to me is you were mentioning big data and bringing big data into this.
And big data is all built around sensors around bringing in as much data as you can from as many sources as you can. And so, the idea is that an author writes a book and that's a great thing, but that author's got a million other ideas in his head. He has come up with so many other different concepts as he was writing this book and we want to know those things. We want to capture those things. Those are great things for us to learn and for us to see. And if we can capture those kind of concepts then not only do we get that great written work, hopefully a great written work, right? We also get data that can build communities. It can build more content. It can give him tools to write more good content for us. It can link to other things. And then you've got, you bring in things like sentiment analysis and semantic searching and we can really get a good feel for the field that he's writing about, the topics he's writing about. We can build a package of content and data. It's an incredible way to bring it out. One of the interesting things and why we wanted to speak with you is, one, there's a data angle here around big data, databases and technology under the hood. But at the higher level there's a confluence of human and machines. We think of Google indexing the world's data and making it searchable. That's the web. Right now we're in a whole other paradigm of the social web, people web, the connected web, the mobile web, whatever you want to call it. You now have access to ideas and thoughts and the notion of machines brings scale, level of scale to the table. At the same time it doesn't necessarily replace the human touch. So what's interesting is the confluence between the human knowledge and machines. You put that together and you start deploying some of these scalable learning machines, machine learning, AI-like functionalities, ontologies, you can bring a whole other level of data acquisition. To take the burden off the author or the writer to do more. 
Explain what that's all about. The way I see it is you've got, again, your author with his many ideas, he's got his what, his paper and his notebook and maybe he's got Wikipedia open and he's doing some searches and he's trying to find things out about what he's writing about. If he can have tools, if he can have tools that are built on the things he's writing, or he or she, if you're writing a book and every word that you type can bring up a world of new knowledge and this is all based on what a machine thinks you should be writing about. This is all based on what other authors of your kind have written about and we can bring these things to the author and we can bring it to them real time and we can get better content out of it and then there are some cases, while the author is always very important, that content builds a world of its own. That content moves on and it writes its own stories. If we include things like enterprise level searching, if we include things like large databases that hold the metadata that surrounds these things, we can truly have stories written by these machines. We can have stories written by the system based on what our authors have done with it. It's smart content. So, you guys were talking earlier about the post-production value of that. What have your experiences been in that regard so far? I think the value is outstanding with it. Once you have that kind of information, you can use it for a world of possibilities. You can use it to discover new things. You can use it to enter new fields, new areas of thought. You can use it to enrich your existing content, find gaps in your content. You can use it to find out places where you should be going that you're not already and then you can create data-driven applications. These are things that can be brought to market very quickly and there are ways that you can bring this content and this expertise out in a packageable format. It's good for mobile. It's good for any purpose. 
Okay, Steve, I want to turn our attention to Oracle. Of course, you are @OracleAlchemist on Twitter. You're an expert, you're a former practitioner in Oracle. What's your take on Oracle OpenWorld? This is the fourth year we've been here. I think you've been to several more than we have. But what are you seeing here this year? What's exciting you? What's your take on where Oracle's at? There's been a lot of excitement this year. It's been a very exciting year. I know that every time there's a new major release of Oracle, there are always great technologies that come out of it. The 12c technology this year has been phenomenal. It's Oracle's biggest architecture change in a very long time. The multi-tenancy features, the cloud features, the plugins into things like Hadoop for big data. It's absolutely amazing what they brought out, and the community has responded. As I was mentioning to you before the show, I've been on a mission since Oracle 12c came out on June 25th to aggregate community content. This is not Oracle official content. This is community content by bloggers, etc. In the three months since 12c came out, I've aggregated over 387 articles written by over 130 different authors around the world on Oracle 12c. The response from everyone has been phenomenal. Let's unpack some of those. Multi-tenancy, cloud, Hadoop plugins, community response. A couple of years ago we heard Larry say multi-tenancy at the application level is wrong. A lot of people, myself included, interpreted him as saying multi-tenancy is a bad thing. He wasn't saying that. He was saying it's got to be designed correctly, and he's sort of taken shots at Salesforce. He takes shots at Workday for doing multi-tenancy at the application layer, not in the database layer. Can you help us squint through that? What does he mean by that? Do you agree with it? Why multi-tenancy at the database layer versus the application layer? What's the benefit to the customer?
I would say at the database layer the main benefit is transparency to the application. That is a good benefit as far as not having to change the most time-consuming, money-consuming part of your development cycle. If we put multi-tenancy in the database, the database is the most cumbersome part of our application world. And if we can put that in a cloud environment with multi-tenancy built in, and we can give applications their own pluggable databases and make it very easy, very agile to move them around, to back them up, to clone them, we can facilitate newer methods of development. We can really make development take off by making it simpler on the developer and on the SDLC. We can make it easier to get those applications to market that way. You give high marks for that. The C in 12c stands for cloud. Larry told us about that last year. So what makes 12c, 12 cloud? That's an interesting one. You look at the scope of the letters. 8i came out, and it was i for internet because of Java. We had the g, the 10g come out, and a lot of people had no idea what the g meant. They thought grids were clusters and clusters were grids. It took a while and it took a lot of Larry talking to let us realize that. For the cloud, I think we're going to see more features soon. As of right now that feature is multi-tenancy. It's a system that we can put in the cloud and we can have app containers, and that's a great thing. But among the upcoming features, one that's currently in beta is self-provisioning of these pluggable databases. And the ability for a development group or any group to push a button and automatically get their own provisioned, not multi-tenant, but container system is tremendous. That is the true cloud ability, and that lends itself to software as a service. And then the Hadoop piece. John and I are always joking on theCUBE: Oracle will be late to a market, and then they'll maybe acquire a company or develop something and act like they invented it.
So Hadoop, now you're seeing all kinds of stuff. We heard NoSQL, we certainly heard Hadoop. We heard all kinds of big data discussion by Mark Hurd on Monday. So what's your take on Oracle's play in big data generally and in Hadoop specifically? Honestly, I'm very optimistic about it. If you compare it to the other plays that they've done, in this one they're almost embracing other organizations. So not only have they developed this hardware, like their big data machine, which has really great enterprise-quality equipment, they have also opened up connectors. It works with Cloudera. It works with Hortonworks. They've got these connectors. They realize that, yes, we must adapt and work with other companies that are developing this technology. In fact, I was talking to someone from Cloudera just yesterday about the fact that OBIEE software can now write workflows in software like Oozie instead of just PL/SQL. Instead of just sticking to Oracle-only software, they've opened it up. And I think that's a good thing for them. Well, we had Mike Olson on yesterday and he's clearly happy with the Oracle relationship, primarily because he's driving business for Cloudera. I made the comment that a lot of times relationships are one-way streets. What gives you confidence that this won't turn out to be a one-way street? What's your thought on that? Will Oracle ultimately try to co-opt that entire ecosystem, or is that not going to happen? I think that Oracle is definitely going to try to co-opt it. I don't know if they'll be able to. This is a larger ecosystem than we've been dealing with. RDBMS is a big deal, of course. It always has been. But it's now just a piece of the data puzzle, and everyone's realizing that. I do think they're going to try to co-opt it. I don't know that they'll be successful. But yeah, I share that. I guess you could call it pessimism, that that relationship can be one-way.
So I want to push on you a little bit because you've got a deep DBA practitioner background. I want to come back to that. But you're very positive on Oracle's Hadoop approach, big data approach. You think this is going to add a lot of value. Clearly it's already adding value, but there's a lot more to come. The question I have is, can you juxtapose or help us rationalize the typical Hadoop model? Commodity infrastructure, scale-out, NoSQL, open source, versus engineered systems, you know, million-dollar boxes, you know, red stack, etc. Juxtapose those two and again, what gives you confidence that the Oracle model will ultimately thrive? I don't know that the Oracle model will thrive. I can say that it will play alongside. Just like with open source databases, you've got your MySQL, which is now owned by Oracle, but with all of the NoSQL databases, there will always be the desire for an enterprise-class system on the side. And while some companies are great with building a Hadoop architecture for $100,000 from, you know, MicroSystem or whatever, and they're fine with building their own Hadoop systems, there will always be those organizations that will want large-scale enterprise quality with the big sun on the box, and they're going to want to see that and know that they are getting the best quality out of their hardware. It's what you pay your money for. One other thing is that Oracle's model has always revolved around licensing and the amount that they're going to make from support. So they're making these great hardware sales and they're making these amazing machines, their big data machines and their big memory machines and all of these things, but this is going to benefit them mostly in the software, and there will always be the companies that want that. Yeah, you're right, they're in the licensing and the support. I'll tell you the reason why I think the Oracle model will thrive: because the Oracle model will morph into this new model.
Just as it has. Oracle owns Java, it owns MySQL. All those so-called disruptions haven't seemed to hurt Oracle from a financial standpoint. We've talked about that a lot here on theCUBE, the size of the company, the market value, the cash flow. I'll get back to the DBA piece of it. And specifically, yesterday we heard Joe Tucci, the godfather of VMware and EMC and Pivotal, along with Jeremy Burton and some other EMC executives, put forth their vision of a horizontal infrastructure, a horizontal world, a virtualized world, and of course Oracle's been contrasting their vision the entire week. EMC also made a big attempt to speak to the DBA. You understand the Oracle value proposition. It's alluring. You care less about the cost of a disk drive. You care more about the business value that you can drive to your organization and making applications run well. I want to specifically ask you, now that Oracle's in the hardware business, from a DBA standpoint does that open horizontal message resonate with you? Does it fall on deaf ears? Are you more attracted to the Oracle message? Sure. It does resonate with me, and the reason is because yes, Oracle's always been a stack-it-up, let's stack it high, let's build big monolithic systems. Then they came out with RAC, which came off of OPS, and RAC was a great system. It was transparent to the users. It was very beautiful and you could horizontally scale your system. Larry introduced the subject of the grid and now it's getting bigger. What I'm happy about is that that horizontal scaling is now much more possible. What we've got now is these platforms that can grow as wide as we want them to be, and they can be elastic as long as you have the right infrastructure, and I think that with that infrastructure either model can thrive. There are going to be companies, government, medical institutions, various others that I've worked with. They're going to want the very tall systems that go as high as you can.
They're going to want two of them, one for backup and one for their production, right? But then you've got the companies that are trying to innovate and do new things, and they're trying to bring in a lot of concurrency. They're going to like the idea of a scaled-out system, a very horizontally scaled environment. So that does appeal. Okay, the other thing that I want to talk to you a little bit about, we're going to sort of get down in the weeds. I know, John, you love when I do that. But we're talking about RMAN as a backup approach, because you heard EMC talking yesterday about how important backup is. The DBA wants transparency into the backup process. We've heard stories, and I want to sort of validate them or deny them, that the DBA felt like, I don't know if my data is really being backed up. I'm frustrated about that. I go to my storage admin or my backup admin. I ask them, and everything's fine. It's all good. But I can't see it. I don't have that transparency. Is that a real problem? Is that just a perceived problem, or one that's marketing, or is that a real problem for DBAs that folks like EMC and now Oracle are solving? It can be a real problem, particularly when you have a large amount of data that's being backed up and held on to. I've worked with hundreds of clients, and with those clients we've had various backup and recovery solutions being used. I would say that for the most part the DBAs know where their data is and they know where to find it. But in a large enterprise with dozens and dozens of databases, you have got data spanning back years, and we have no one to trust except a library that says, oh yeah, it's there. Trust us. So I can see that, but not as often as you would think.
Steve, I've got to ask you, obviously in the book business, we're talking about big data and the things that you're working on, changing the game and bringing in modern aggregation and a richer user experience, both for the users, the readers and the consumers, as well as the authors. More value. Which is really great. That's the whole big data mission. But let's talk about the state of the current book industry. I mean, there are people who built their franchises, like John Wiley and those guys and others. It has to evolve. What are the current challenges that the book industry faces in general? Not just you guys, just in general, the philosophy. Some are going to hold on to the business model and milk that cow until it's dry, or evolve and eat their own, if you will, to bring in the new. Cannibalize their own business to set up the new business. What's your take on that? My take is that with traditional publishing, there will always be a place, I think, for good editing and good solid advice on book writing, particularly for highly technical and highly skilled things such as medical, science, etc. But at the same time, with the growth of your Wikipedia and your self-service publishing, with the growth of e-reading, the traditional book is going away. There's just not as much interest in it. It's still big. Print still has a great place for us. But we have to recognize that just taking pages and moving them on to a digital device, that's also not going to cut it. What we're doing is, yes, we need our content, yes, we need our books, but we need them in new and interactive ways. And some people just try to run events and that's all they do, books and events, but you're taking on a whole other discussion, a platform. Yeah, I think that data and content as a platform is a very important thing. Big data is a big keyword right now and we all know that. But big content is another big idea, where you've got these large quantities of data that are formed into an already existing format.
If you look at a book, what is it? A book with 100,000 words is a piece of data, it's a piece of content that's pre-built for you. And if we can take that and we can provide enterprise search, we can provide semantic searching, we can provide new windows to look into it. That's big content, and that to me is a big deal. You know what, that's music to our ears at theCUBE. We are friends of big content because we pump out as much content, as much signal into the barrier of the network as we possibly can. Big content is a great analogy. At the end of the day, content is the new thing. You're seeing the trends, content marketing. Marketing to the persona of one. You're seeing personalization now take on a very unique value proposition, where people want individual personalization, no more generic, you know, categorical content. So the content explosion is really impacting everything. What's the challenge? What's the hurdle? Is it time? Is it just inertia? What's it going to take? What's your view on that? Well, these are great new things, these are great new technologies, but some of it is based off of budding concepts that are just now coming out. We've got recommender systems and we've got large-scale predictive systems, which require a lot of work. So there's the time involved, and then there's just the changing of the philosophy. Publishing companies in general have had a philosophy for a long time, and now we're turning into media companies. We are digital media, and that transition of thought is a big change. What we used to think of as a long, let's say two-year, drawn-out process of getting a good book published is now a quick process. Let's get it published, and let's get apps out, and let's get a whole world of content around it. I think this is a really big trend. Big content, big data equals big content, big value. You're seeing transformation, journalism, venture capital becoming their own publishing houses.
You're seeing publishing houses becoming more like search engines and platforms. Big data, big content, big networks, that's what it's all about, and at the end of the day it's going to be about user intimacy, user experience and expectations. So that's the big trend. If you want to look at the tea leaves and what's going on at Oracle and in the big data world and infrastructure, that's the future right here. It's upon us, and again, it's just the beginning of it. Steve, thanks for sharing your perspective. This is theCUBE. I'm John Furrier. We'll be right back with Oracle OpenWorld here live in San Francisco. Expanded coverage of Oracle OpenWorld 2013. We'll be right back.