Okay, we're back here live inside theCUBE. This is SiliconANGLE's exclusive coverage of the O'Reilly Media Strata conference. This is day three of wall-to-wall coverage. We go all day, talk to the thought leaders, and extract the signal from the noise. This is theCUBE, our flagship program. Then at 12 o'clock we have a one-hour produced news segment that highlights day three and the top news of the conference, where we do our patented breaking analysis and break down all the top news for you. I'm John Furrier, the founder of SiliconANGLE, and I'm joined by my co-host.

I'm Dave Vellante of wikibon.org, and John, as you know, over the last couple of years we've been tracking the whole in-memory trend. We certainly heard a lot about it a couple of years ago at SAP Sapphire with HANA, and you really can't go to Sapphire these days without getting a big dose of it. We heard from SAP this morning about in-memory, and we've had a number of interesting, innovative startups on theCUBE talking about in-memory. It's intriguing, John, and kind of ironic. People joke that Hadoop is the new tape, but in fact things like in-memory and Flash play a critical role, because people want fast, near-real-time access to certain data, particularly metadata. So we're going to talk about that. We have Bill Bain, the founder and CEO of ScaleOut Software, a startup that's focused on in-memory with a little different angle on it. So first of all, Bill, welcome to theCUBE.

Thank you very much.

So you heard my intro. In-memory has taken the world by storm, and we're seeing a lot of activity around Flash. ScaleOut Software has something called a data grid. So tell us, what is an in-memory data grid?

An in-memory data grid is a middleware software technology that's actually been in use for more than a decade.
This technology emerged in the early 2000s, after the explosion in the use of web server farms. It was originally employed to scale access to data by allowing data shared across a cluster of servers to be uniformly accessible across the cluster, and to handle growing workloads by simply adding servers. It also incorporates high availability and many other features, in particular analytics.

So what's the new twist that ScaleOut Software brings to the in-memory grid business?

Well, the new twist is what we call ScaleOut Analytics Server. We introduced this product last October, and what it does is integrate MapReduce-style parallel data computation into the scalable in-memory data grid. This enables operational data, live data such as web shopping carts, financial hedging strategies, or even trading positions in a trading market, to use MapReduce to both update the data and analyze it in real time, within milliseconds or seconds.

So yeah, okay, I've been wanting to ask what you meant by real time. People talk about real time, just in time, in time, but what does real time mean to you?

Okay, well, different people define real time differently. It's not hard real time in the sense of an aircraft control system. It's near real time in the sense that we can return results in seconds, whereas other systems that are pulling data from disk will typically return results in hours or even days, depending on how the overall analytics architecture is organized. We're organized to analyze data as it's flowing through a system millisecond by millisecond. To give you an example: on a trading platform where market prices are flowing through on a millisecond-by-millisecond basis, you want the ability to do a MapReduce operation, return results, and steer trading decisions within a few seconds, and that's what this technology can do.
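The pattern Bain describes, running a map and reduce pass directly over live operational data held in memory, can be sketched in a few lines of Python. This is a hypothetical illustration, not ScaleOut's actual API; the position data and function names are invented for the example.

```python
# Minimal sketch (not ScaleOut's API) of map/reduce over live in-memory
# data: trading positions live in an in-memory store and are analyzed
# in place, without ever touching disk.
positions = {  # in-memory "grid": symbol -> live trading position
    "AAPL": {"qty": 500, "price": 61.20},
    "GOOG": {"qty": 200, "price": 790.05},
    "IBM":  {"qty": -300, "price": 209.50},
}

def map_exposure(pos):
    """Map step: compute the dollar exposure of one position."""
    return pos["qty"] * pos["price"]

def reduce_sum(values):
    """Reduce step: aggregate the per-position results."""
    return sum(values)

# One full map/reduce pass over the live data set.
total_exposure = reduce_sum(map_exposure(p) for p in positions.values())
```

Because the positions are updated in place, re-running the pass after each market tick yields a fresh answer in milliseconds, which is the "steer a trading decision within a few seconds" loop described above.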
So John and I often talk about things like serving an ad before you lose the customer, or in your situation, the example you just gave, making the trade before the price changes, before you miss the opportunity.

Exactly.

Yeah, I mean, scale out is obviously a term that's been kicked around. People want to scale up and scale out with horizontally scalable systems, and the need for performance and security is obviously top of mind. I was just talking with the CEO of DataStax, and their specialty is real time with Cassandra. But there really aren't a lot of real use cases out there for real time yet. We saw it with Impala on Hadoop, but there is a lot of need, and the people who are doing it are doing it in memory. What we're finding is there are flagship applications out there, small relative to the overall market right now but growing very rapidly. Can you talk about your view of the dynamics of the in-memory database? Because now you have multiple tiers: in-memory, flash, and disk.

Right. So first of all, in-memory data grids can be coupled with SSDs to expand their storage capacity, although today you can store a terabyte to 10 terabytes of data in memory on a large cluster, which is within the scope of something like 60% of all analytics applications. That said, most of our customers, and we have about 370 customers over an eight-year period, are typically storing less than a terabyte, typically hundreds of gigabytes of fast-changing live data, whether it be shopping carts, as I mentioned, or manufacturing data. We have a very large news service that's caching news data that changes by the second. So there are many use cases: gaming servers, like the one we heard about yesterday in the keynote, would apply, for analyzing the game state and optimizing the experience. Now what's interesting and unusual is that these are operational systems.
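The memory-plus-SSD pairing mentioned above can be illustrated with a toy two-tier store. This is a hedged sketch, not any vendor's implementation: the capacity limit and the dict standing in for an SSD-backed tier are assumptions made purely for the example.

```python
# Hypothetical two-tier store: hot entries stay in RAM; once a capacity
# limit is reached, overflow goes to a larger, slower SSD-backed tier
# (an ordinary dict stands in for it here).
MEM_CAPACITY = 3

memory_tier = {}   # fast, capacity-limited
ssd_tier = {}      # larger, slower (simulated)

def put(key, value):
    """Write to memory while there is room; spill new keys to SSD after."""
    if key in memory_tier or len(memory_tier) < MEM_CAPACITY:
        memory_tier[key] = value
    else:
        ssd_tier[key] = value

def get(key):
    """Read-through: check the fast tier first, then fall back to SSD."""
    if key in memory_tier:
        return memory_tier[key]
    return ssd_tier.get(key)
```

A real grid would add eviction and promotion policies between the tiers, but the cost trade-off is the same one discussed here: SSD expands capacity at some latency penalty.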
They're systems in which the data is being updated and used in real time. So what we're doing is layering MapReduce analytics on top of that, bringing the power we've seen with Hadoop to this community of users, which is typically a different community than the analysts and BI experts we normally see at conferences like this.

Talk about the dynamic there, because there are a lot of cutting-edge use cases right now with live data. It's real time: there's financial services, you mentioned games, for example. Talk about where the evolution is on the enterprise side, because the enterprise is moving fast to dashboards and real-time business management. You know, we were at SAP, as Dave mentioned, and they talk about the speed of business. So can you talk about your experience seeing the enterprise, and your perspective on that market and how fast it's changing?

Well, I think what we're seeing is that this technology is more and more being incorporated to accelerate operational systems. We're not yet seeing it used in Hadoop deployments today, although we will see that change with systems like the one Intel introduced for caching HDFS. With those platforms, we're starting to see memory come in as a caching layer. But we're talking about bridging this whole usage model, from the high end that we're seeing in technologies from EMC and Intel, into operational deployments that have been ongoing for many years.

Dave, what's your take on this conversation? Because we had a great panel yesterday on high performance computing, and obviously on the scientific side we were talking with some Python developers, and you're seeing a lot of the scientific community active in this space. But high performance computing vendors are also coming in with God boxes and large systems. What's the trend there?
I think that, well, the first thing is we all know that the spinning disk has been a bottleneck in systems forever, and the best IO is no IO. So with the advent of Flash and the productivity impact of putting data in memory, you're starting to see clearly justifiable expenditures on these types of technologies. That's points one and one-A. The second thing is there's a lot of confusion in the marketplace. Is it in memory? You were talking before, Bill, about how you can actually add a layer of Flash to persist the storage. It may be a little slower from a response-time standpoint, but it's probably substantially less expensive. So there's a balancing act going on, a lot of experimentation, and people are trying to squint through that and learn more. One of the companies we had on at Oracle OpenWorld, Aerospike, sort of does that: it focuses mostly on the Flash piece and claims it's less expensive. So customers are confused. Help us understand what the right use case is for each of those types of architectures.

Right. Well, of course, traditional Hadoop MapReduce is great for analyzing petabytes of data that are offline, that have been taken from an online system, or even from a system like Cassandra or MongoDB, and moved into HDFS for analysis. Another use case, of course, is to store data directly in memory, or cache it from HDFS for a smaller data set, and be able to do quick turns, running many analyses on the same data repeatedly at lower latency, removing the IO time and the network time, which are the two key bottlenecks we see in Hadoop. We're going to see this technology employed in that area more and more to reduce those two key overheads.

Yeah, and as you eliminate those overheads, particularly the mechanical spinning disk, the network bandwidth becomes the key bottleneck.
And then of course the other point I would make is there seems to be a lot of activity around metadata access and metadata management, and accessing that from server memory as opposed to slow disk.

Absolutely, and that's been done for many years. I think companies like Isilon have been using in-memory storage for metadata for several years. What's really emerging at this conference, that I've found interesting, is the increasing use of the Hadoop infrastructure for relational database queries, to speed up SQL queries. And our experience over about a 10-year period now is that parallel query can create a new bottleneck that has not yet been discussed at the show: the bottleneck when you return large volumes of queried data. That creates a network bottleneck and then a computational bottleneck. So one of the things you can do with MapReduce, and that we're doing with our technology, is unrolling that and moving the MapReduce back into the query engine, so you don't just do the parallel query, you do a sophisticated analysis of the data in place. You eliminate moving the data off the server in order to both query it and then analyze it.

Yeah, you know, that's a great point. I don't know if it was Alistair or Edd Dumbill yesterday on theCUBE saying they had a friend who did some work, and it took four hours and 15 minutes to return the result. And they said, that's strange, four hours and 15 minutes. Oh, it took four hours to move the data and 15 minutes to analyze it. So, to that last point, Bill. Okay, listen, we're out of time. Bill Bain of ScaleOut Software, really appreciate you coming on theCUBE. Good to have you. And we'll be right back with our next segment.

Thank you.
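Bain's query-in-place point, aggregate where the data lives instead of shipping the full result set over the network, can be sketched as two versions of the same query. This is a hypothetical illustration; the row shape and predicate are invented for the example.

```python
# Hypothetical contrast of the two query styles described above.
# Shipping every matching row to the client creates the network and
# computational bottleneck Bain mentions; pushing the analysis into
# the query engine moves only the answer.
rows = [{"sym": "AAPL", "qty": i % 7} for i in range(10_000)]

def query_then_analyze(pred):
    """Anti-pattern: return the full result set, analyze client-side."""
    matched = [r for r in rows if pred(r)]          # large transfer
    return sum(r["qty"] for r in matched)

def analyze_in_place(pred):
    """Query-in-place: aggregate where the data lives."""
    return sum(r["qty"] for r in rows if pred(r))   # one number moves
```

Both return the same answer; the difference, at scale, is the four-hours-to-move-the-data versus fifteen-minutes-to-analyze-it split in the anecdote above.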