It's theCUBE. Here is your host, Jeff Frick.

Hi, Jeff Frick here. We are on the ground at the Westin St. Francis in San Francisco, California, at HBaseCon 2015. It's a big conference; actually, there are about 500 people here. We wanted to come out, see what's going on, and get a little update. And I'm joined here by our latest analyst from Wikibon, George Gilbert. Welcome, George.

Thanks, Jeff. Good to be here.

Yeah, so why don't you introduce our guests?

All right. We have two distinguished guests from Bloomberg, Sudarshan and Matthew. They were keynote speakers this morning, talking about how they're applying HBase to some of the pressing financial applications that are a struggle to handle with SQL databases. Welcome.

Thank you.

Okay. So, what did you guys cover in your keynote this morning? What was the focus of the keynote?

I think there were a few broad topics. Part of it is about why we have this effort in the first place and what we hope to accomplish. Bloomberg has been building software systems for over 30 years now, and we have hundreds of thousands of functions and tens of thousands of databases. What do we want our infrastructure to look like in the future, and what are the pieces that we see coalescing around that? We use a lot of databases. We have needs for high-performance, high-reliability systems with analytics built on top of them. We see HBase as part of a broader foundation, including analytics, for where we see a number of our systems going in the future. So we gave an overview, and Sudarshan also offered a glimpse into some of the efforts we have for driving systems forward into the future.

So tell us: financial services has been sort of ground zero for the spread of relational database technology, along with telcos, always the leading-edge industries. For you guys to step outside that comfort zone, what were some of the catalysts that pushed you?

Sure. I guess I would think of the question in a slightly different way. Telecommunications and financial services have a need to process a lot of information in a reliable, independent way, and so you need to make sure that you have systems that can back that up. It's not that we're unhappy with SQL, far from it. We actually see these systems converging, and the things you get with a traditional relational database are quite useful, from easy ways to select data to transactional semantics. And we think all those things are actually going to come to many of these distributed systems in the future as well. So really, the driver behind what we're doing is: how can we consolidate a lot of our infrastructure around fewer, faster, and simpler systems, where we don't have to build, invent, and support all of the infrastructure ourselves, so that we can take advantage of developments in the external world without being beholden to a single vendor? Those steps will take time, though, as the systems mature. So it's not that we're stepping away from a comfort zone; it's that we have a plan for the future.

Okay, this is very, very significant, because what you're saying is you expect the capabilities from the world that we've lived in for 30 years, and that's not going away. But it sounded like you're saying many of those capabilities are going to grow out of this new foundation that people have termed NoSQL. How do you see that progression going?
So I think that when you look at these systems, the things that people think of as NoSQL today were designed to handle batch analytics for people like Google, who have hundreds of petabytes and are crunching web pages to serve up to millions of users. Those systems will be adapted to cover an increasingly broad array of capabilities that more and more people need, so we expect to see them moving toward our use cases. And our premise from the outset was to start with a system that's known to have been operational in production for a long time and whose theoretical underpinnings are well understood. If you look at HBase, HBase is derived from Bigtable, which has been running at scale at Google for well over a decade. That, in combination with some well-understood database techniques, essentially creates the foundation for distributed, MPP-style analytics. Sudarshan, do you want to add to that?

Sure. You talked about financial services having a strong affinity for SQL databases. Financial services companies are risk-averse to some extent, but they've also been ground zero for technology innovation, even though they may not talk about it much. At Bloomberg, over the years, we've built different systems using whatever technologies were prevalent at the time, and we've been quite effective at scaling them up as our data volumes have grown. But the holy grail is: what functionality can we put in front of our clients so that they can answer questions they could not answer today? We see a lot of promise in many of these open source technologies. The main challenge is how you take technologies that were designed in the big data context and make them work really well for the fast-data, low-latency, interactive use cases that Bloomberg has.

So do you see that as a big differentiator when you move to financial services and what you guys do, really the latency and the transactional speed, as opposed to, like you said, the batch workloads that Google was using this for?

So Bloomberg is a premium product, and our customers have very high expectations about the resilience of the service and the performance of the service, and any service disruptions are very disruptive to our clients' workflows. So as we adopt these various open source technologies, there is a lot of work that needs to be put into the underpinnings of these frameworks to get them up to our service-level expectations. That's what Matt and I are engaged in, and it's been exciting.

But following on that, the traditional SQL DBMSs scale pretty well. They have the hardening that's gone on for, as Oracle says, 37 years to optimize and tune that query optimizer, and they've been hardened over that same amount of time. Why is it that that infrastructure might not grow into the future, to the extent that now you're moving with a community that's putting similar capabilities on a new foundation?

Sure. Great question. I think some of that actually fits into the earlier question of how we differ from the classic original use case, and I think Sudarshan nailed it right on the head. It's not just latency; it's also a premium product where reliability is critical. There's no question that traditional DBMSs, and you mentioned Oracle, for example, are robust and well understood and powerful. To be honest, we don't actually see them as being different from what we're doing, in some sense.
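To make Sudarshan's point about fast-data, low-latency, interactive access concrete, here is a minimal illustrative sketch, not anything Bloomberg presented, using the standard HBase Java client: a single-row point Get of the kind an interactive request would issue, alongside a range Scan closer to the batch analytics workloads these systems were originally built for. The table name, row keys, and column family/qualifier ("quotes", "px", "last") are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class AccessPatterns {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("quotes"))) {      // hypothetical table

            // Interactive, low-latency path: a single-row point lookup by key.
            Get get = new Get(Bytes.toBytes("AAPL#20150507"));                 // hypothetical row key
            get.addColumn(Bytes.toBytes("px"), Bytes.toBytes("last"));         // hypothetical family/qualifier
            Result row = table.get(get);
            byte[] last = row.getValue(Bytes.toBytes("px"), Bytes.toBytes("last"));
            System.out.println("last price bytes: " + Bytes.toStringBinary(last));

            // Batch-style path: scan a whole key range, the access pattern analytics jobs tend to use.
            Scan scan = new Scan();
            scan.setStartRow(Bytes.toBytes("AAPL#20150101"));
            scan.setStopRow(Bytes.toBytes("AAPL#20150601"));
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result r : scanner) {
                    // process each row in the range
                }
            }
        }
    }
}
```

The point-lookup path is what has to meet interactive latency expectations; the scan path is the kind of access the original batch-oriented designs were optimized for.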
We think that all of these systems are converging toward something unified. So if you imagine the spectrum from a pure, random key-value lookup, something like memcache, on the one hand, to a single-machine, traditional relational database on the other, there's a spectrum in between. Most of these systems are morphing toward essentially the benefits you get from a traditional relational database, where you have transactional semantics and SQL and fast, low-latency requests, but that is also resilient and distributed across a cluster, and where you're not beholden to a single vendor. So the real difference is that this is a system that, instead of starting on a single machine and then spreading to multiple machines, starts on multiple machines and then adds the speed, performance, and efficiency of the single-machine systems. That's number one. Number two, the rate of progress of these systems is very swift because of the size of the community that they've developed. And that is also critical to us: we don't want to build our systems around something that's dependent on a single commercial vendor.

But talk a little bit about the juxtaposition: you have open source, you have the active community, they're building stuff, they're innovating quickly. But now you lose that one throat to choke, right? So then you get into this whole conversation of open core versus complete open source, and then you've got a premium, high-value product; you need service, you need support when you need it. How are you guys integrating an open source approach into a really high-value, high-performance type of system?

I would say carefully and thoughtfully. Ultimately, what is the value of being able to call a vendor? The truth is, if there's a critical problem and you need to resolve it within minutes, a vendor call takes too long anyway. The real advantage is in having well-understood, well-tested components that you understand and have run in production for a period of time, and that other people have also tested. That's where real reliability comes from: that, and actual production experience. When we develop a system, we don't just toss it into production, see if it works, and then call a vendor if something goes wrong. Whether it's open or not, there's careful and ongoing testing, destructive, Chaos Monkey-like work, to see whether it holds up. And then our new systems generally have to run in parallel for an extended period of time with the ones we already have, so we can verify whether we have missed anything under actual real-world conditions. So there's a very thoughtful engineering approach, which I would say Sudarshan really understands quite thoroughly too.
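As an illustration of the parallel-run verification Matthew describes, here is a rough sketch of a shadow-read comparator, under assumed interfaces rather than anything Bloomberg has published: it keeps serving results from the existing system while issuing the same read against the new HBase-backed system and counting disagreements. The RecordStore interface, class names, and logging are hypothetical.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical abstraction over any store we can read a value from by key. */
interface RecordStore {
    byte[] read(String key) throws Exception;
}

/**
 * Shadow-read comparator: keep serving from the existing system while issuing
 * the same read against the new system and recording any disagreements.
 */
public class ParallelRunVerifier {
    private final RecordStore legacy;     // system already in production (source of truth)
    private final RecordStore candidate;  // new HBase-backed system under evaluation
    private final AtomicLong compared = new AtomicLong();
    private final AtomicLong mismatched = new AtomicLong();

    public ParallelRunVerifier(RecordStore legacy, RecordStore candidate) {
        this.legacy = legacy;
        this.candidate = candidate;
    }

    /** Returns the legacy result and counts/logs any disagreement with the candidate. */
    public byte[] read(String key) throws Exception {
        byte[] expected = legacy.read(key);
        compared.incrementAndGet();
        try {
            byte[] actual = candidate.read(key);
            if (!Arrays.equals(expected, actual)) {
                mismatched.incrementAndGet();
                System.err.printf("mismatch for key %s%n", key);  // a real system would log richer context
            }
        } catch (Exception e) {
            mismatched.incrementAndGet();
            System.err.printf("candidate read failed for key %s: %s%n", key, e);
        }
        return expected;
    }

    public double mismatchRate() {
        long total = compared.get();
        return total == 0 ? 0.0 : (double) mismatched.get() / total;
    }
}
```

Run long enough under real traffic, the mismatch rate and the mismatch log are what tell you whether anything was missed before cutting over.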
Let me ask you about the economic aspects of that, not being beholden to a single vendor. When you look at the price list of, say, a premium traditional vendor and add up all the options, it's something like $150,000 per processor core, not per processor, so a 16-core system is 16 times $150,000 at list. How do you see the prices deflating, and to what extent? And then what does that enable you to offer your clients that would just have been too prohibitive before?

I would like to answer the question slightly differently. Bloomberg is a very profitable company, and we can afford significant capital expenditure. So although economics matters, that's not the primary motivation for pursuing open source technologies. The main reason is: how do we leverage the great work that other people are doing, how do we contribute the insights that we've gained in developing similar systems in-house, and how do we work as part of the larger community for the greater good? Economics matters, but I would say that's not the primary reason for us to be involved in open source technologies.

Just a follow-up question in terms of getting that through to management that doesn't necessarily understand the technology: was that a hard sell? Do they get the value of this community, and of this combination of being a proprietary company selling really interesting stuff while also leveraging open source technology with a broad community? And I presume, if you're an active member, you're contributing back to it. As you said, do they get it?

That is actually a really interesting question, because there's a big shift underway in many large companies in the country, and in financial services as well. It was once the case that a lot of basic computer science techniques, how to build a fast and reliable low-latency distributed database, for example, were a fundamental competitive advantage. You built that system in-house, you didn't tell anybody what you did, and that was part of the secret sauce that made you successful. That has ceased to be true for essentially everyone. The only place where that's still left is in some edge cases at places like Google, for creating world-spanning databases with GPS clocks and so on, a problem faced by very few people in practice, which is the only reason it hasn't been subsumed into the larger whole. I think we've witnessed a dramatic shift inside Bloomberg as we've come to this understanding in recent years, and we have very supportive and enlightened management that has really enabled all the things that we're doing. Part of what makes that so interesting is, I can't reveal any names, but there are other famous technology companies where people say, we wish we could do that; how did you guys get there? And really, part of it is vision on the part of our management.

So it sounds like it's not just the economics, and it's not just not being beholden to a vendor, but it's a new mode of innovation, which is a faster pace. Is that fair?

It's the faster pace, it's the size of the community and the amount of know-how, and it's the economics and not being beholden to a single vendor. It's not one or the other; it's all of them together. The whole is greater than the sum of the parts, and it makes for a compelling strategy.

Well, that's great. We're getting the hook, and that's a great close. So, Matt, Sudarshan, thanks for spending a few minutes with us. A great story from Bloomberg. I remember back in business school, Jeremy Siegel used to play that thing like a concert piano. I don't know if you've ever had that experience; if you get a chance to go to Philly, go check it out. I'm Jeff Frick. We are here at HBaseCon in San Francisco. You're watching theCUBE. Thanks for watching.