Okay, we're back live here on SiliconANGLE.tv's theCUBE. We're live at Cassandra Summit in Silicon Valley, and I'm here with Yuri Cohen. Yuri, welcome to theCUBE.
Thank you.
And Jeff Kelly, my co-host for this event. Jeff, how are you feeling so far?
I'm enjoying it.
So we talked to a lot of the alpha geeks, obviously, from DataStax as well as Netflix. Adrian was on, he was cool, he was talking about his demo. But I want to get your perspective. What's your take on the event? Share with the folks your take on what's happening here, and then what is happening that they might not know about. Share something original.
Okay, so my take: I came here expecting it to be a medium to small event. And then this morning there were about 800, close to 900 participants, and it's mind-blowing, right? It's, like, tripling over the last few years. So I'm pretty amazed. And I was telling Jeff, it's not just the number of participants, it's actually the focus and the level of conversation that we're having with people, very professional. I mean, the people coming are either engineers or engineering managers, they're looking for solutions, and they know their stuff. So we're very happy with it. I mean, it's been really nice. It's great.
Yeah, I noticed the same thing. It's not a lot of novices coming here saying, let's learn a little bit about the technology. It struck me how mature this ecosystem is, compared to those people who are two years old. So why don't you tell us a little bit about GigaSpaces, what you guys do, and then we can go into some of the details of what you guys are doing around Cassandra.
So we've been around for quite some time now. We have two products. The first one is called XAP, the eXtreme Application Platform. It started off as a distributed in-memory database, kind of similar to what you see today with Cassandra and other NoSQL databases.
We took it from a bit of a different perspective. It's all in memory, it's transactional, and it's mostly targeted at transaction processing. So our primary adopters were financial services organizations that needed transactions and couldn't lose any message. So it's a bit different from the requirements of NoSQL today. That's one thing we do. And the other thing we do is a product called Cloudify, which is designed to take complex, multi-tier big data apps to any cloud environment. Managing big data apps is not very easy because of the multiple components. Cassandra is just one piece. You usually have Hadoop and MapReduce on top of that, and your web containers, your SQL databases, and caching layers. So providing a way to manage all that consistently and auto-scale it on any cloud environment, that's something we do with Cloudify.
What do you think about the big trend around solid-state memory and disks, et cetera? Obviously, it's changed the game, as we heard from Adrian. Cassandra's a nice use case for anyone who has solid state in their architecture. Obviously, that's where the market's going. How is it affecting cloud in particular? Cloud is now segmented. Public clouds are all the rage, OpenStack is a big proponent of creating this whole public cloud, and Amazon we know about, Adrian talked extensively about that. But the enterprises also have on-premise data centers. And then you've got this hybrid cloud. So private cloud, hybrid cloud, public cloud. So what's happening in this space, because now you're seeing things span across clouds? What's going on in those areas right now?
So I think it really depends on the type of organization you are. Actually, one of the things I talked about in my session here is how cloud portability is not really something unique. Even if you're starting a small company, you need to be thinking about it today, because in cases like Zynga and Mixpanel, these guys started off with small setups, then went to public cloud, and then realized that for the scale they're at, they're probably better off moving at least some of their capacity back to private clouds. So I think what we're going to see moving forward is a mix of both, where part of the workloads, maybe the most sensitive ones where you need low latency, are going to be on private clouds, and you're going to see public clouds for most of the rest.
What do you think about companies like Box.net that have gotten huge valuations out there? I mean, they're basically just doing file sharing. Is that cloud?
It's consumer cloud, yeah, it's consumer cloud.
Well, they claim they're doing enterprise.
Yeah, but they're not focused on enterprises building applications. They're not a platform, they're really a consumer product, and their customers are consumers within enterprises. So I'm not sure about the valuations, that's up to them. Close to a billion dollars, but then again. It's pretty amazing. I can't tell, it's just quite a lot.
You can say it, go ahead, you can say it, it's hyped up. John's trying, he's trying to extract.
No, I mean, they're putting out a lot of PR saying they're enterprise grade, and we cover Oxygen Cloud and we cover Box and Dropbox, but I think they're consumer applications. And consumer applications in the cloud are great business models: you've got freemium, you're free and then you kind of go premium.
Okay, cool. When you start doing enterprise-level cloud, there are nuances that are table stakes: security, data portability, and certain data center kinds of features that need to be in place, and it's very difficult. Can you expand on that conversation, because that's what people are trying to figure out. What does enterprise cloud mean? Enterprise ready, enterprise cloud. We know there's a lot of shadow IT going on with public cloud, so that's okay, I think that's just competition. Developers want to play, there are tools in the public cloud, and they play out there, but once they prototype it, they've got to get it to production. So that's been the big theme here. Talk about that, getting something into production. So I've done some shadow IT, now I've got to bring it back to the enterprise and make it ready.
So I think in terms of the way you would do that, public cloud versus private cloud is not really all that different. These are the same concerns you mentioned, security and portability and all of that. What I can say is that when you're building an enterprise solution, you can't neglect a lot of the things that people neglect when they go to public cloud. You have to start accounting for availability and all of the SLAs, and not everyone is Netflix, who can actually build a huge infrastructure around that to be able to cope with all those failures and data center crashes and all that stuff. So if you want to do enterprise in the cloud, that means you're probably going to have to design your applications for that.
So let me ask you about virtualization. Obviously, virtualization has been a game changer. We saw VMware evolve from a company that had a hypervisor, you had Citrix out there, and then Xen and Citrix. Then VMware is now moving with that data center focus, or enterprise, now with Cloud Foundry. So it's pretty clear what they're doing, right? Pat Gelsinger's in charge.
VMware is going to be a whole different company than they were. But is it viable with virtualization, is it totally game changing? Also, we saw that Nicira was bought by VMware for a billion dollars or something. They got an amazing amount. Crazy. Again, they were the VMware for networking. That was the quote, the VMware of networking. Is there a cloud virtualization play like that, where virtualization actually changes the clouds that you guys are doing? And is it that easy to just put something into a bucket, saying you guys are doing virtualization with cloud? Explain what that means, virtualization in the cloud.
So we're actually doing, I would say, something a bit different. If you have an application, right, if it's a legacy application and it's very complicated, we're going to allow you to take the same application and put it in your data center, or on a virtualized environment, or on a public cloud, without changing a single line of code. Obviously, that's very important for two purposes. One is when you're trying to decide where you want to deploy this application, you want to be able to test things easily. You want to be able to deploy the same application on multiple environments and see what works best for you, even across multiple public clouds. That's one thing. And then as your business evolves and your requirements change, you want to be able to switch pretty easily.
What do you think about the database? So obviously, infrastructure as a service and platform as a service have been great enablers for new things. Enterprises are realizing they can do more things now with solid state. So internally, inside the firewall or on-premise, solid state has offered up great new provisioning opportunities, with better storage and caching layers. So there's some coolness going on there. You've got some coolness going on in the cloud. What's happening in between there?
If you're a developer, can you connect the dots there? I mean, is there a way to create a cloud infrastructure? Because if I'm an enterprise customer, I say I want to do the cloud, I want to build my own cloud. How do they do that?
So, first of all, solid state is, I think, a game changer, especially with the kind of stuff that Amazon is doing, and we've seen it. If you attended Adrian's talk this morning, it's really about doing a lot more with the same resources. Traditionally, there have been two problems with IO in the cloud, with just writing stuff to disk. One is that it's very slow. If you compare a server in the cloud and a non-virtualized server in your own data center, sometimes the latter would be 10-fold faster. Just one server, just doing IO on that server. So that's one thing that SSD, and the stuff that Amazon is doing now, addresses. The other thing is the fluctuations that you would experience with IO in traditional cloud. Up until now, before you had SSD, you could get decent performance on average, but then you would get those spikes where something that would take you 30 seconds would suddenly take you 20 minutes, for no apparent reason. So you have to accommodate for that. You can just browse through the Netflix blog and see how they do that, like writing stuff to three or four locations just to accommodate for that. So it's definitely a game changer in that respect. If you're building a private cloud, I would say SSD is definitely something you need to take into account. If you're an application developer, it really changes the way you can think of things. We traditionally come from the in-memory database or in-memory caching area, and SSD gives you a lot more flexibility in the transition between memory and disk.
So you can have this kind of middle tier that is almost as fast as in-memory, but it's much more robust, it's much more scalable, and it can store much more data.
So Yuri, tell me what you think about the different horses on the track, as we say. Mongo, Cassandra, HBase, you've got SimpleDB, you've got DynamoDB, the Amazon one. Did I miss anyone?
Probably Riak.
Right, yeah. Couch, yeah, there are a bunch of players there. So break them down for us. Mongo seems to be the most vocal against Cassandra. It seems to have that kind of developer crowd. HBase has emerged, it has Hadoop HDFS. But what are the benefits? I mean, we're at Cassandra Summit, so we want to compare and contrast Cassandra with some of those other ones. Is there a winning use case for each one? That's what we're trying to explore.
Yeah, definitely. So Mongo, I would say the biggest benefit there is usability. It's so easy to pick up and get started with that most developers would default to that database. And I can understand why. You've got drivers for every language, so no matter what you're using, Ruby, JavaScript, Java, there's a solution for you there. Cassandra traditionally was a really good database as far as the distribution model, as far as the clustering goes. In my opinion, it's a more robust implementation than Mongo.
In what ways? As a distributed database, or?
In the way that, first of all, the algorithms there are more kind of pure. It's easier to set up because everything is symmetric. If you want to set up Mongo in a cluster, you have a bunch of roles and components there. If you want to set up Cassandra, you just kick-start a few processes and they kind of figure out the roles between themselves. So it's much easier to do with Cassandra. Where Cassandra fell short up until recently is the usability part, the API part. Two years ago, it was very hard to get started with.
It's still harder than Mongo, but things are improving. And that's one of the things I'm seeing here, this great improvement in usability and the stuff that Netflix is doing, for example, around APIs. They took the Cassandra API and created their own kind of layer on top of it, which makes things a lot simpler to work with. We have CQL, which is similar to SQL and allows users who are familiar with SQL to work with Cassandra without picking up all the unique terminology, so to speak. So I'm very, very optimistic about where that's going. HBase has its use cases. Obviously Facebook is a big proponent of it. They've done a lot of work on improving the code base over the last 12 or even 18 months. Their inbox and all that functionality is now based on HBase. HBase itself is a bit more difficult to set up than Mongo and Cassandra because it sits on top of Hadoop, so you need Hadoop infrastructure to begin with and only then HBase. So in terms of setup, I would say it's the most complicated of the three, but both HBase and Cassandra are very good at write scalability and at accommodating applications that have massive write requirements.
I wonder if we could talk about analytics a little bit. Certainly Cassandra is known for supporting real-time transactional applications, big data applications, but when it comes to the analytics part, we talked a little bit earlier with Billy Bosworth, the CEO, about partnering with some of the BI players to do some analytics. I know GigaSpaces recently rolled out some capabilities to build your own real-time big data analytics platform. Big data analytics is hard enough. When you add the real-time component, now you're really talking about some technical challenges. Talk a little bit about both the technical challenge of building that kind of platform and the use cases. What does it allow you to do?
What kind of insights does it allow you to get that you can't get from a more traditional batch type of analytics?
Sure. So, I mean, the driver for that, right? Let's start with that. Hadoop is very big, and you see a lot of companies behind it, a lot of use cases, a lot of tools. So I'd say the ecosystem is pretty mature and robust, but Hadoop is offline, okay? It's very good in that you can push a lot of data very quickly to Hadoop, but then if you want to process it, that can take anywhere from minutes to hours. And in many use cases that's just not feasible, right? If you're Twitter, for example, and you're building an analytics dashboard, or if you're Facebook and you have those widgets on people's websites, you want to give them the insights as fast as you can, because that means money. If I'm publishing some content on my website and I see that it's not getting the traction I want, or it's getting negative traction, I want to change it very fast. I want to be able to do that in real time. It doesn't help me that Hadoop can give me the insights a day or two later. So we're seeing a lot of real-time analytics tools being created.
So Twitter Storm is another framework that people are talking about, kind of the online version of Hadoop. What do you hear about that? Do you play with it at all?
Yeah, yeah, definitely. I like it a lot. I mean, it's very good for what it does. I think it's going to pick up a bit more. It's kind of the real-time version of Hadoop. So instead of pushing data to HDFS and then processing it offline, basically you process the data as it comes in. It passes through that real-time analytics layer, and then you can store that data in Hadoop and do later analytics to touch what you're doing. Yeah, but you want to do the real-time stuff as it comes in because that's when it matters, okay?
So simple analytics. You can obviously do pattern analysis and, I don't know, fraud detection, stuff like that, but what you can also do is actually count things, do some simple queries, all that stuff. For example, if you want to build a web analytics dashboard, that's very relevant. A lot of our customers have been doing that for years in financial services, like processing trades and ticks and stuff like that, where you want to do it online. So that's the kind of stuff that you can do with real-time. We had this in-memory data grid platform, so we built some eventing capabilities around it to be able to handle these kinds of workloads and scenarios.
Interesting. So is it an alert-based system, where you detect anomalies and then push that information either to another application or process, or to a human, to take some kind of action? Is that kind of it?
So it's up to you. It's more of a development platform that allows you to distribute the processing of those events across multiple nodes, to store information in memory in a reliable fashion, to do it transactionally, and basically gives you all the framework that you need to be able to build such applications.
So what are the challenges around doing that kind of work, real-time analytics in a big data environment, versus complex event processing? CEP engines have been around for a while and, as you mentioned, financial services firms have been doing that kind of thing with smaller data volumes and maybe not quite as high velocity. So what are some of the challenges, technically speaking, when it comes to applying CEP-like technology, real-time streaming analytics, to big data?
So it's about the sheer scale of things, right? If you have enough capacity in one server to process events, then you can do it much more easily.
Because you have a single source of information that's co-located with where you do the processing, you don't need to think about how you distribute the data, how it's load balanced across the servers, or what kind of data you would need to access from each node. Because if you're distributing, and you're processing an event, and you realize as you're processing it that you need data from another server or from a bunch of other servers, it's not going to scale. So there's a lot more thinking about what your use case is, and what the most important piece of your application is that you want to scale, than with traditional frameworks. That's the challenge with those kinds of platforms.
So Yuri, we're going to end this segment, but I wanted to ask you one last thing: share with us what you're working on, your framework, what your company's doing, get a quick plug in, and what you're doing here at the event.
Okay, so I mentioned the two products that we have. We'll start with Cloudify. Cloudify itself is, like I said, a framework for pushing big data apps to the cloud. One of the key components that we're seeing in the market, that our prospects and customers are demanding, is Cassandra. So we've built a very nice, we call it a recipe, for Cassandra, which allows you to embed a Cassandra tier within your big data application, push it to any cloud environment, scale it dynamically, and basically accommodate all the management aspects of what it takes to manage Cassandra on the cloud. As far as the other platform is concerned, we've built an offering where you can put this in-memory event processing grid in front of Cassandra, process stuff as it comes in, in real time and transactionally, then push it to Cassandra as a back end and do the long-term stuff over Cassandra, storing information with almost unlimited capacity, which you can't do with in-memory.
Awesome. Yuri Cohen, VP of products at GigaSpaces.
Thanks for coming on theCUBE. Appreciate it. We'll be right back with our next guest after this break. Thanks a lot. Thank you.
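The architecture described in the closing answers, where events are partitioned so each one is processed next to the data it needs and results are later pushed to Cassandra as a back end, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not GigaSpaces' implementation: the `Partition` and `Grid` classes, the CRC32-based routing, and the plain dict standing in for a Cassandra table are all hypothetical names chosen for the example, and transactions and replication are left out.

```python
import zlib
from collections import Counter


class Partition:
    """Hypothetical in-memory node: holds counts only for keys routed to it."""

    def __init__(self):
        self.counts = Counter()

    def process(self, key):
        # Everything this event needs is local, so no cross-node lookup occurs.
        self.counts[key] += 1

    def flush(self, store):
        # Write-behind step: merge accumulated counts into the long-term store,
        # then clear the in-memory state.
        for key, n in self.counts.items():
            store[key] = store.get(key, 0) + n
        self.counts.clear()


class Grid:
    """Hypothetical grid that routes each event to exactly one partition."""

    def __init__(self, num_partitions):
        self.partitions = [Partition() for _ in range(num_partitions)]

    def route(self, key):
        # Stable hash, so the same key always lands on the same partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def on_event(self, key):
        self.partitions[self.route(key)].process(key)

    def flush_all(self, store):
        for partition in self.partitions:
            partition.flush(store)


def run_demo():
    grid = Grid(num_partitions=4)
    backing_store = {}  # assumption: a dict standing in for a Cassandra table
    for page in ["/home", "/pricing", "/home", "/docs", "/home"]:
        grid.on_event(page)  # real-time path: count page views as they arrive
    grid.flush_all(backing_store)  # long-term path: persist the aggregates
    return backing_store
```

Routing by key is what keeps the processing co-located with the data: an event for a given page only ever touches the one partition that owns that page's counter, which is the property the interview says is needed for this kind of system to scale.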