We're back. This is Dave Vellante, and this is theCUBE, SiliconANGLE's flagship production. We're here at Moscone South in San Francisco. This is the AWS Summit. AWS does probably a dozen of these types of summits around the world. The big conference is, of course, re:Invent in the fall in Las Vegas, but these regional events allow practitioners, CIOs, IT professionals, developers, and partners to come together, collaborate, share ideas, and share best practices. And we're here with Rahul Pathak, who is the Senior Product Manager at AWS focused on Redshift. Rahul, welcome to theCUBE.
Thanks, David. Great to be here.
Yeah, I appreciate you coming on. So we heard Andy Jassy this morning saying Redshift was the fastest-growing service in the history of Amazon, so you've hitched your wagon to a great horse. Congratulations there. Tell us what Redshift is and how it came about.
Absolutely. So Redshift is a data warehouse service in the cloud, and it came about, really, because our customers were asking us for a better way to do data warehousing. They didn't want to pay upfront fees, and they didn't want long procurement exercises or complex services. And so with Redshift, we wanted to bring something that was fast, inexpensive, and easy to use, and easy for customers to provision and scale as they needed.
So at Wikibon, we've done a number of studies with our practitioner members on data warehousing, and there's a lot of frustration with data warehousing. I think in many ways, data warehousing has failed to live up to its vision. Obviously it has played a big role with reporting and compliance and the like, and certainly analytics and so forth, but at the same time, the infrastructure supporting data warehouses, as we were talking about off camera, is broken in our view. I use the analogy of people chasing chips, trying to constantly upgrade infrastructure, a patchwork of infrastructure.
High degrees of frustration. What's different about Redshift in the cloud?
Absolutely. You're right that the rate at which data is being generated today, and our appetite for analyzing it, outstrips our ability prior to Redshift to really do data warehousing at scale. And so with the cloud, we're able to allow customers to provision terabytes and scale up to petabytes of data warehousing with just a few clicks and a wait of just a few minutes. So that flexibility and ability to change configurations based on data needs is what the cloud brings, and that's the new model that we're bringing to market.
So why couldn't you just use sort of an off-the-shelf SQL database or RDBMS and bring it into AWS? Why did you have to build one from scratch? Can you talk about that a little bit?
Sure. So Redshift is an analytic data warehouse, and what that means is it's optimized for large-scale analytics dealing with large data sets, aggregates, scans, that sort of thing. Traditional databases for transactional workloads are really optimized for lots of frequent updates, and so there's a mismatch between workloads. In the case of Redshift, we wanted it to be really, really fast at delivering answers to complex analytic queries, and everything in the system, from its architecture to its underlying hardware platform, has been designed with that in mind.
Can we talk about the architecture a little bit? Can you sort of give us a Redshift 101 from an architectural perspective?
Sure. So Redshift is a SQL-based relational data warehouse. It uses a shared-nothing, scale-out, massively parallel processing architecture, and what that means is you can run clusters of nodes. We let you start with one 2-terabyte node and scale up to a hundred 16-terabyte nodes for over a petabyte of storage, and we automatically redistribute your data and allow you to take advantage of all those resources when running queries.
And then from a software architecture perspective, it's a columnar system, which means that all data is stored in columns on the drives. That allows you to focus on reading just the data that you need to answer a given query, whereas with traditional systems you'd have to read and throw away a lot of redundant columns to get the answers that you want. And that columnar system gives us compression, which gives us better performance and cost savings as well.
So there's a lot of talk in the database world about two topics, NoSQL and in-memory. You didn't mention those two areas; you didn't go in that direction architecturally. Can you talk about why, and is that a trend that you guys see yourselves eventually hopping on?
So NoSQL and in-memory are very much part of the database ecosystem, and our philosophy at AWS is that one size doesn't fit all. With Redshift we wanted to provide a great tool and a great service for data warehousing. We expect, and we have, customers and partners that will find other tools to fulfill other use cases, and we think they'll all play very well together as part of an overall data system.
Horses for courses, as they say. Talk about some of the use cases and applications for Redshift in the customer base.
Absolutely. So we're seeing Redshift being used in a broad range of applications, really as a central data repository. Customers are generating data in transactional systems, as we talked about, and getting real-time data from NoSQL systems, and they're bringing that into Redshift to analyze it. We're seeing all sorts of use cases: mobile, social, advertising, analyzing web traffic, as well as the more traditional enterprise data warehouse use cases where you're trying to marry operational and business performance data to get a total view of how your business is doing.
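Editor's illustration of the columnar point made in this segment: a minimal Python sketch, assuming a toy in-memory table, of why a column layout reads less data for a single-column aggregate and why repetitive columns compress well. The sample rows, the function names, and the choice of run-length encoding are all assumptions made for illustration; the interview does not say which layout details or compression encodings Redshift actually uses.

```python
from itertools import groupby

# Hypothetical sample rows: (order_id, region, amount)
ROWS = [
    (1, "us-east", 20.0),
    (2, "us-east", 35.5),
    (3, "us-east", 12.0),
    (4, "eu-west", 50.0),
    (5, "eu-west", 7.5),
]
COLUMN_NAMES = ("order_id", "region", "amount")


def to_columns(rows, names):
    """Pivot row tuples into one list per column (a toy columnar layout)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def sum_amount_row_store(rows):
    """Row layout: every field of every row is read to sum one field."""
    fields_read = 0
    total = 0.0
    for row in rows:
        fields_read += len(row)  # the whole row comes off "disk"
        total += row[2]
    return total, fields_read


def sum_amount_column_store(columns):
    """Column layout: only the 'amount' column is read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(ROWS, COLUMN_NAMES)
row_total, row_fields = sum_amount_row_store(ROWS)
col_total, col_fields = sum_amount_column_store(columns)
encoded_region = run_length_encode(columns["region"])

print("row store:   ", row_total, "fields read:", row_fields)
print("column store:", col_total, "fields read:", col_fields)
print("region column, run-length encoded:", encoded_region)
```

Both layouts return the same total, but the row layout touches every field of every row while the column layout touches only the one column the query needs. The `region` column also shows why values of the same type stored together tend to compress well.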
Yeah, so we were talking off camera. I was talking about Hadoop, and you said it's complementary. So can I infer from that, essentially, Hadoop for batch, filtering, big giant analytics as a batch job, and then you take the nuggets and bring them into Redshift? Is that how you see it working?
So the way we're seeing customers use Hadoop is really to process unstructured data, to do some of the work that you're talking about at scale, and batch analytics. Then they're bringing that into Redshift for doing online and more interactive querying using SQL, and Redshift allows them and their analysts to use the tools they're familiar with.
When you talk to Hadoop practitioners, some of the guys who were developing Hadoop in the early days, some of the committers, they'll say that Hadoop was really meant to be batch; it was designed that way. And then others, of course, you're seeing in the industry trying to bring real time (I'll put quotes around that) together with Hadoop. It sounds like with Redshift, you've chosen to separate that out. Is that sort of a long-term strategy? Can you talk about that a little bit?
So that's part of our overall AWS philosophy, which is we want to design services that are best of breed for their particular use cases. As you look across database and compute, we have RDS for relational database services, DynamoDB for NoSQL, EMR for Hadoop jobs, and Redshift for data warehousing. And so we see improving functionality in each of those independently as being part of our one-size-doesn't-fit-all approach to services.
Yeah, so you're saying observers should continue to expect high degrees of granularity of services, is that right?
So we're really all about building blocks and letting developers and administrators make the best choices for their applications.
Yeah, and you're even seeing that now. Andy Jassy was talking about the different types of EC2 instances that you have, purpose-built for different workloads.
No, absolutely, and one of them he mentioned, the high storage instance, is actually what's underneath Redshift. It's designed for high-performance data processing, and that philosophy absolutely carries over with the high memory and the high I/O instances, each targeting different use cases and different customer workloads.
So talk about that a little bit more. So high I/O, high storage instance for Redshift, what does that mean? What makes it high performance, high I/O?
Absolutely. So Redshift has two node sizes. The larger one is based on the High Storage Eight Extra Large instance, and it has 128 gigs of RAM, 16 cores, and 24 drives, for 16 terabytes of compressed storage. It's really optimized for high-performance data processing, and so that was the hardware underpinning of Redshift, and it's really targeted at that workload and use case.
Yeah, so I wonder, Rahul, if we could shift gears and talk about partners. Talk about partner integration. Maybe you could talk about some of the specific partners that are more interesting to you or your customers, or things that they're doing with partners. Maybe just double-click on that a bit.
So data warehousing fits into an ecosystem, as you've mentioned, and really we have partners at all levels. There are data loading and integration partners that help with bringing data from multiple different sources into AWS and into Redshift. We've got BI partners, which help customers make sense of, visualize, and analyze the data that they've stored in Redshift. And then we also have services and systems integrators who are helping customers design the overall data flows that they need to develop ongoing processes to make sense of the data they generate.
Yeah, so we were talking as well off camera about flash, and we were having sort of a friendly debate about that. You basically made the statement that spinning disk is the ideal platform for Redshift because of the cost factor. Can you talk about that a little bit?
So I don't think I made quite that statement, David, but in general, spinning disk is ideal in this case because we're using really large volumes and lots of drives to develop our high parallel I/O performance. And once you start talking about hundreds of terabytes or petabytes of data, it's really the best solution.
Yes, you're saying it's not economically practical to put all that in flash, right?
Yeah, 48 terabytes of SSD or memory would, sorry, would cost quite a bit.
Yeah, so, okay, let's see. So where are we with Redshift? When was it announced? Where are we on the adoption curve? Can you talk about that a little bit? I know it's very steep right now.
So Redshift is the fastest-growing service at AWS, as Andy talked about. It was announced at re:Invent in November in a preview, and then we launched the service in mid-February in our Virginia region. Since then we've rolled it out to Oregon and Dublin, with other regions to come in the not too distant future. So we're-
Okay, so it's in the U.S. today and a couple of European cities, right?
So it's in our Dublin region, our European data center.
Okay, and then the intent is to roll it out across the globe, is that right?
Absolutely, we plan to do that.
How does that work? How do you guys do your rollouts?
So in general, as with everything else at AWS, it follows our customers. We want to deliver our services in all regions, and we follow demand to those regions as we have it.
So is there no set pattern? I mean, do you start on the East Coast and then move, or does it just sort of depend on where the customers are?
There's typically a variety of factors, but customers are always the driving force.
But there's nothing to preclude you guys, for instance, from launching a service overseas? Or is that-
Absolutely not.
Has that happened before?
Not to my knowledge. Typically not. In general we want to provide a complete offering in all areas of the globe where we have a presence.
Yeah, but it takes time to roll it out globally, right?
Absolutely.
Yeah, so tell me what's exciting to you in this whole database space. Five years ago database was kind of boring, and now it's like where all the action is. What excites you in database?
Well, the pace of change is pretty spectacular, and I love how, especially with AWS, we're really able to bring a new model of provisioning and dealing with database scalability and performance, at price points that didn't really exist. So I think we're solving a lot of age-old problems for our customers in new ways.
So what do customers tell you about Redshift that they couldn't get before Redshift? And what are they pounding on you for on your to-do list?
So two great questions. I was at a customer meeting last week where the customer asked how long it took to provision a Redshift data warehouse. When we told him 15 minutes, he started laughing and was comparing that to the weeks or longer that it took him in an existing environment. So I think ease of provisioning, scalability, and price point are things that come up all the time in terms of wow factors. And the beauty of always focusing on our customers is they're never satisfied. They always want to push us to do more, to do it faster, to do it cheaper. And we're going to follow.
So you're not going to give me the roadmap on theCUBE, I take it?
Absolutely not.
So that's great, Rahul. I really appreciate you coming on. Redshift, great story, continued innovation, and congratulations on, as I say, hitching your wagon to the hottest horse on the track. So thanks for coming on theCUBE.
My pleasure. Thanks for having me.
All right, everybody. Thanks for watching.
Keep it right there for our next guest. My co-host Jeff Frick and I will be covering this all day, the AWS Summit here at Moscone. Keep it right there. This is theCUBE. I'm Dave Vellante. We'll be right back.