Okay, welcome back everyone. This is SiliconANGLE and Wikibon's theCUBE, our flagship program. We go out to the events and extract the signal from the noise. Exclusive Amazon Web Services re:Invent conference coverage, live from the floor. I'm John Furrier, the founder of SiliconANGLE. I'm joined by my co-host, Dave Vellante, co-founder of wikibon.org. Our next guest is Brennan Saeta, infrastructure engineer at Coursera, which is taking the world by storm because online education done in a new way has been something that we're passionate about. Obviously, we do free content, Dave and I, and Dave and I are always talking about the failed project, or the project that never actually got off the drawing board, Silicon Academy, which was to be kind of like an educational video platform to help people. But I got to ask you, the trend is your friend. Obviously, with online education, Stanford has had huge success with their online courses. I mean, the amount of enrollment, or people watching the computer science classes, has really shown the way. And quite frankly, it's disrupting these institutional educational facilities. I mean, they make money from tuition. And you give it away for free. We do give it away for free. We think there's a lot of value in what the universities still provide, and obviously we wouldn't exist without our university partners. But what we want is to help our university partners expand from simply teaching the students who they can attract and afford and who they can give financial aid to, and really expand beyond that so that they can teach people who don't have the resources, the means or the time to be able to come to campus. There are even other students, one we featured on our blog, whose name is Daniel, who quite simply can't function in a traditional classroom. He has autism and he just has trouble carrying on a conversation. He took a modern poetry class on Coursera and it's just been a phenomenal experience for him.
We've been able to reach people in ways that we didn't even know we were gonna reach. It's been a really fantastic part of the journey. You know, it's been fun to watch and it's still early days. We're super excited. We're really behind what you're doing. Congratulations on that. But let's get into the geeky side of it, because when you have to roll this out, this demand, obviously partnering with those colleges and universities, they have legacy stuff. But generally speaking, you got to put it out there as a service, right? So you guys are doing that. So walk us through some of the challenges that you guys faced. Okay, there's a lot of demand for this, and all of a sudden there's a tsunami of online activity. How did you guys architect your solution? What did you guys do with Amazon? What in the cloud made it sing and how did you wire it together? Yeah, so we started from typical humble beginnings as a PHP website written by a bunch of grad students with some undergrad help. And we really started from there. We've been moving away and really changing around our underlying stack to make things a lot more powerful and a lot more user friendly. So for example, right now on Coursera it's really hard to, for example, see a list of all your classes' upcoming assignments, and that's something we're working on changing. We're doing a lot of things underneath the hood. But really starting from the beginning, we didn't have very much funding when we were starting in the fall of 2011. It was a little bit before my time. But Professor Andrew Ng and Daphne and some of the other Stanford faculty, they wanted to run this experiment. And we got started with Amazon and we haven't looked back. It's been really phenomenal. Amazon has allowed us to scale our capacity in ways that we never thought would be possible. I actually just looked at our bill from last month and we're serving almost a petabyte of traffic through CloudFront every single month.
So are you storing the videos up there? Are you hosting them on, is it all cloud, any bare metal at all, any on-prem? The only bare metal we have are our developer laptops. Everything is in the cloud, as it were, on S3, CloudFront, EC2. Dave and I always talk, I always say, DevOps guys are like, they eat glass. I mean, they're like a unique breed. And they're usually young guys, right, like yourself. Us old dudes like me, we used to load patches, load software, go in and put in a disk drive, or a CD-ROM. But now the new model of coding is to push code, and the DevOps philosophy is a home run: fully integrated stack, updates across the code bases. So I want you to talk about how that influences you guys. How do you guys do development? Is it pure DevOps, is there a lot of agile? Just take us through how you guys do the coding. That's a great question. So the infrastructure team at Coursera, up until a few months ago, has really been comprised of only two full-time engineers. Yet we're managing a really massive scale, over five million students. Our students are more active than you would see in a lot of different situations, different environments. So even though the number of students we have isn't necessarily as big as some of the other big websites out there, there's a lot of load and a lot of students. And we quite frankly can't manage all the servers individually. So we very quickly moved to having everything be an auto-scaling group. Everything is installed and managed in very much a DevOps, even a NoOps, fashion. We've moved beyond, to some degree, some of the tools like Puppet, simply because they're hard to work with. There's an impedance mismatch between Puppet and auto-scaling, where machines come and go all the time. And so we've actually to some degree even moved beyond that. As for our development methodologies... Do you use auto-scaling? Like how do you handle the auto-scaling? Do you use Elastic Beanstalk? Or what are you guys using?
No, so we auto-scale in EC2 based on CPU usage for our front ends and for our back ends. We've just been over-provisioning to some degree, but auto-scaling really gives us the flexibility to shoot a machine in the head and have it come back automatically without us having to do any intervention. And actually just this morning we got a bad instance. I was able to just terminate it right from the console and everything immediately recovered as traffic flowed to the newly created instance. Shooting it in the head versus what? Shooting it in the heart, I mean. Just terminating it. Kill that machine, everything rolls over, all the software's there. That's right. We don't have the time to deal with machines as individuals, we have to just treat them in bulk. So talk about some of the tools that you're using or some of the services. Yeah, so we've taken a lot of inspiration from Netflix, specifically their Asgard deployment model. And we've actually implemented some of our own tools in-house. I've written some of my own tools in-house to help us deploy code. We started by manually going through and updating the code on all the instances, and we really wanted to have a much more rapid development process as we're trying to handle the rapidly increasing load. So I wrote a shell script that made things pretty automated, but we've since moved to a web-based console, and since we've moved to that we've gone from deploying every few days at the beginning to where we now deploy five, six, seven times a day, and it's been a huge boon for developer productivity. It's been a huge boon for reliability. We now have the ability to do a fast rollback. If we push bad code we can quickly roll it back in about 10 seconds. It's made everything work a whole lot smoother. I've written about it a little bit on my blog, betaCS.pro. So talk a little bit more about the services that you're using, maybe how you're protecting your data.
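The fast rollback described above can be sketched in a few lines. This is a minimal illustration only, not Coursera's actual tooling: the `ReleaseManager` class, its method names and the build labels are all hypothetical, and a real tool would also have to move traffic between instances. The idea it shows is that if every deploy is recorded rather than overwritten, rolling back is just pointing "live" at the previous entry.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReleaseManager:
    """Toy model of a deploy tool that keeps history so rollback is cheap.

    Hypothetical names throughout; this is not any real deployment API.
    """

    history: List[str] = field(default_factory=list)  # oldest first

    @property
    def live(self) -> Optional[str]:
        """Version currently serving traffic, or None before any deploy."""
        return self.history[-1] if self.history else None

    def deploy(self, version: str) -> None:
        """Record a new version as live; earlier versions stay available."""
        self.history.append(version)

    def rollback(self) -> str:
        """Drop the live version and return the one that is live afterwards."""
        if len(self.history) < 2:
            raise RuntimeError("no earlier release to roll back to")
        self.history.pop()
        return self.history[-1]


if __name__ == "__main__":
    releases = ReleaseManager()
    releases.deploy("build-101")
    releases.deploy("build-102")  # suppose this push turns out to be bad
    releases.rollback()
    print(releases.live)  # build-101
```

Because nothing is rebuilt during `rollback`, the cost is only the switch itself, which is the property that makes a rollback measured in seconds plausible.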
Yeah, so our primary data store is Amazon RDS. We obviously follow all the typical best security practices to manage Amazon RDS. Underneath, we actually use a whole plethora of Amazon services, and that's simply because we don't necessarily care about how it gets done. We need to get it done quickly. We have so many demands on us. We have so many features we want to support. We don't have time to build our own queue service, so we use either Apache Kafka or Amazon SQS. We use Amazon SES, SNS. We use S3, Amazon CloudFront. I can list off the whole alphabet soup of everything we use. An intern, Brian, he mentioned that he misses talking about Amazon SQS and the security groups on EC2 with the auto-scaling, CloudFormation, CloudFront stuff. Anyway, it was pretty funny. We use a whole alphabet soup of Amazon services. So can you, Matt? How about just one more question? So how do you protect stuff, back it up? Talk about that a little bit. That's right, so all of our data is either in S3, which has massive availability. We back stuff up to Glacier every once in a while, something we should do more frequently. We have all our data in RDS, which has daily incremental backups, and finally we have a little bit of data in some NoSQL stores that we also back up to S3. Something that we actually want to work on, and we're very glad that the Amazon RDS team just released cross-region snapshot copy, is that we want to back stuff up outside of the US East region that we currently reside in. Why don't you use Glacier more? You haven't automated the Glacier archiving? Yeah, for us to use it, we have to automate it, and we just need time to go through and make that so. Have you ever had to recover from Glacier?
We have not had to recover from Glacier, but luckily we've turned on Amazon S3 bucket versioning so that if a developer accidentally deletes the wrong thing, we're able to quickly recover it. And we've actually deprivileged all of our dev keys so that even if someone malicious were to take that stuff, they wouldn't be able to permanently get rid of our data. And we just trust in Amazon and their ridiculous number of nines of durability for S3. So what are some of the things that you want Amazon to do, some of the things that would make your life easier? So cheaper is always a great thing. Well, it looks like they're working on that. They are, which is really great. To a larger degree, I was actually talking with our account manager, Anne, as we were trying to set up our interviews, about how Amazon and the Amazon infrastructure that we're using now has been quite reliable. We're really happy with it. We actually were trying to figure out what feature requests and things we'd ask the individual teams as we met with them. The only things I could think of would be related to getting the web to move over to HTTP 2.0 and allowing us to take advantage of the protocol so we can build more reactive, powerful and sophisticated web applications without all the burden of trying to manage different CloudFront distributions and caching and bundling and all that sort of nonsense. And so really the biggest projects that we would like to see are sort of blocked, but not by Amazon. Amazon's been great for us. What have you learned over the past experiences with Amazon? Obviously we use it as well. We have, you know, for our CrowdChat application and our CrowdSpots platform, we love it. I mean, so for us, I don't want to get into my own rhetoric, but I want to ask you about, I mean, what do you think of the service? I mean, can you imagine living without it? And just share with the folks out there what it's like programming on Amazon as a developer but not an ops guy.
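To make the bucket-versioning point from this exchange concrete, here is a small in-memory model of how a versioned bucket treats an accidental delete. It is a toy written under stated assumptions, not the S3 API: the `VersionedBucket` class, its methods and the example key are hypothetical. It mirrors the documented behaviour that a plain delete on a versioned bucket adds a delete marker on top of the object rather than erasing earlier versions, so removing the marker brings the object back.

```python
from typing import Dict, List, Optional

DELETE_MARKER = object()  # stands in for an S3 delete marker


class VersionedBucket:
    """In-memory toy model of a versioned bucket (not the real S3 API)."""

    def __init__(self) -> None:
        # key -> list of versions, oldest first
        self._versions: Dict[str, List[object]] = {}

    def put(self, key: str, body: bytes) -> None:
        """Every write adds a new version instead of overwriting."""
        self._versions.setdefault(key, []).append(body)

    def delete(self, key: str) -> None:
        """A plain delete hides the object by stacking a delete marker."""
        self._versions.setdefault(key, []).append(DELETE_MARKER)

    def get(self, key: str) -> Optional[bytes]:
        """Return the current version, or None if missing or deleted."""
        versions = self._versions.get(key, [])
        if not versions or versions[-1] is DELETE_MARKER:
            return None
        return versions[-1]  # type: ignore[return-value]

    def undelete(self, key: str) -> bool:
        """Remove a trailing delete marker; True if one was removed."""
        versions = self._versions.get(key, [])
        if versions and versions[-1] is DELETE_MARKER:
            versions.pop()
            return True
        return False


if __name__ == "__main__":
    bucket = VersionedBucket()
    bucket.put("lectures/week1.mp4", b"video-bytes")
    bucket.delete("lectures/week1.mp4")   # the accidental delete
    print(bucket.get("lectures/week1.mp4"))  # None
    bucket.undelete("lectures/week1.mp4")
    print(bucket.get("lectures/week1.mp4"))  # b'video-bytes'
```

The model leaves out permanent deletion of a specific version on purpose. In the setup described in the interview, that is the operation the deprivileged developer keys are not allowed to perform.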
What's it like? So from a developer perspective, APIs make things so much easier. Even though I end up doing a lot of the operations day to day, I don't really consider myself an operations guy. Being able to drive everything via APIs and automate everything has been a huge boon. That said, if you're a developer you still have to realize that Amazon and large deployments and large infrastructure don't behave the same way a small colo does. So, for example, to make things more concrete, when we were initially deploying our new stack that runs on the JVM and uses connection pools, our connection pools were quite simply unreliable. We had to do a fair amount of tuning, and that's simply because the network is hostile. If you're in a typical small colo, or you're in your own private data center, you have a few switches in between your app servers and your database servers. Within Amazon you've got VPC, you've got cross-region, or cross-availability-zone, traffic. You've got a whole lot more nonsense in the way between your app servers and your database servers. You've got to tune accordingly and deal with things accordingly. Always know that things are just going to go away. So I got to ask you, how well does it integrate into Git? And how do you maintain multiple versions of applications? You know, it's easy to scale up and down, but with the auto-configuring, making sure it's the right version, how do you guys handle that? I mean, do you override things, and what do you do, what buttons do you push? That's a great question. So the way our deployment process goes is developers work on their own branches. Every developer is able to run the entire stack on their laptop locally, and you can develop locally in the different branches.
When they're ready to test we have a bunch of different environments, staging environments where developers can test out their code within Amazon and can be QA'd and that sort of stuff. And this is all driven and automated by Jenkins. It's totally click button, make it so type tooling. And then finally when they're ready it gets merged into master and then developers then deploy their master out into production and that's the way it goes. So one of the things that we had, we basically built our own Redis cluster because Amazon didn't ship theirs yet. It's really not stable yet, but they're getting better. But now they shipped it, so it's ironic we had to build our own Redis prior to them launching it today. I don't know if you know that. Got to get in the weeds here. But give an example of what you guys have done that Amazon hasn't done yet that you had to write for code. And is there anything that you've done and then they've come on right after? Because they're pretty innovative. They're adding new services. You saw the chart. I don't know if you saw the chart in the keynote but Andy Jassy's showing, they're deploying more and more goodness into the stack. Yeah, absolutely. What have you guys written that's been your own code? Yeah, so we started, so instructors upload videos, their source videos as they try and provide content for our students to watch and learn from. And this is before the Amazon Elastic Transcoder project existed, we built our own encoding process. We're really happy to shut that down. I actually didn't go with Amazon Elastic Transcoder. We went with a different third party. But yeah, there've been quite a few things that Amazon has built in and we'd be like, ah, if only we had that sooner, we wouldn't have to do a whole bunch of work. Yeah, running, hurry up and catch up, right? So slow down. So they are innovating. What are they working on that you think is important that they need to have that's on their to-do list? 
I mean, it's not slam dunking them, because they are moving fast. Yeah, they're way ahead of anyone else as far as we're concerned. The one that I had is I'd like to be able to see what API calls have been made by what keys through the Amazon services. And they just came out with Amazon Cloud something or other to do security auditing of what API calls have been made and what changes have happened. I actually asked for this in a support ticket just a few weeks ago and they were like, sorry, we don't have anything available for you, and now I have my answer. Yeah, I was talking to Andy Jassy and I said one of the things that they should do is integrate their data warehouse product, Redshift, into our log data. We get a lot of notifications from like Node.js. I just want to get in there and work with that. So that's something that they don't have. We'd like to see that. But for the most part, you know, pushing out new code really trickles through the stack. It's really effective. What's the biggest home run, where you guys sit back and say, you know, this is the killer thing for us? What is that killer thing for your business online with Amazon that really makes everything work? Yeah, so being able to just rapidly deploy, scale up and scale down, but I really want to talk about deployment. I talked briefly about it earlier, just moving to using the programmatic APIs that Amazon makes available. We've built our own tooling on top of AWS and that's just resulted in a huge boon in developer productivity. At Coursera, we're heavily strapped on engineering talent, engineering manpower. If you're interested, come join us. Come drop me a line. What's your Twitter handle? @bsaeta, B-S-A-E-T-A, on Twitter. And my email is just simply my last name at Coursera.org, S-A-E-T-A at Coursera.org. Great.
Being able to control Amazon and rapidly deploy new code has been a huge boon for developer productivity, which is arguably the single most important priority for the infrastructure team. And that is to enable all the other teams to work that much more rapidly and develop that much more quickly. It's just been wonderful. Well, you got 48 followers on Twitter, you just got 49, because I just followed you on Twitter. Excellent. And thanks so much for coming on theCUBE. Really great to see you. And again, I'm old compared to you young guns, the DevOps guys, eating glass, spitting out nails, as we say. But the world's going there. So, you know, I think you guys are a great example of what the future is. And certainly Amazon, seven years old, is growing up real fast. We love it. We love it. It's great stuff. Let's see if the enterprises like it, Dave. So we'll be back with more coverage right after this short break. This is theCUBE, live in Las Vegas on the ground floor here at the Amazon re:Invent conference. We'll be right back.