So, my name is Jeff Johnson, for those of you who weren't just in this room. I am a software engineer working on Cloud Foundry at Google. You can find me at Jeff's Tech Tips on Twitter. I joined Google about eight months ago, and I've been working on Cloud Foundry full-time ever since. But before I joined Google, I, of course, had an interview. And when you interview at a place like Google, you read this book. This becomes the thing you read from top to bottom. You do it so that you can learn how to draw these really fast and how to invert them. And once you feel confident enough, you say, OK, let's do it. Take me on site. Put me in front of a whiteboard. And of course, the first thing they do is give you the 30-minute design question. That's the design question that you can sort of prepare for. Now, the one I got looked like this: I want to sell tickets to an event. Obviously, that's not really a question. And you're not in a strong position to push back, so you say, OK, sure, I'll help you. You start with the simple, naive solution. Start with something simple, then try to make it big. The naive solution looks something like this. Ticket booths have been around as long as there have been tickets. They're a very reliable way to sell tickets, and they work just fine. I really did want the job, so I didn't give this as my answer. But let's think about it. What does a ticket booth provide? Well, a ticket booth is trustworthy. When I say that, I mean: when I hand them money, they hand me back a ticket. I trust that that is my ticket. They trust that is my money. And I trust no one is going to try to take my seat. They haven't given this ticket to two people. They have a physical stack of them that they're handing to me. A ticket booth can support a huge event. And when I say big, let's say 100,000 seats; I assume there are venues that big. But 100,000 tickets is just paper, right? You could build a slightly bigger booth and fit them all.
It's not that bad. Just stack them one on top of the other. The problem with this answer, and the reason why it's a naive solution, is that it's not answering the full question. We have a big event, 100,000 tickets. Well, we probably want to sell these globally. So let's say it's the *NSYNC reunion tour, and we want to sell tickets in the US. We want people in Japan to be able to buy them. Europe, everywhere. This ticket booth is not going to cut it. You could say, well, just build one over there. So we'll build another one in Japan. To do that, though, you'd have to take half of those tickets and put them over there in Japan. But that means you're not going to be able to efficiently allocate these tickets. So let's say it's a huge hit in Japan, and up in Washington no one is really that interested. You've got 50,000 left, but you could have easily sold those if they were just in the right location at the right time. It's hard to make that allocation decision, so it's not very efficient. And it's not very fast: if I walk up and I see 10,000 people in line, I'm gone. It's not going to happen. So all right, the real solution. Not much time left; these interviews are very fast. The real solution, or rather the real naive solution, is something we used to just call a web application, or a website. When I started working on websites, it was PHP and MySQL on a single Linux box. I installed all the pieces, and I think my dad let me put the box in his office, sort of a closet, so I could get free bandwidth. And you know what? That works great. It's trustworthy. There's a single database, a single source of truth. We can do transactions. I probably didn't know about them at the time, but you could do transactions, which means no one is going to grab that ticket while I'm grabbing it. It's globally accessible because it's online. I don't know if you guys have been online.
But if you've been online, you can get to any site. It doesn't matter where it is. It's efficient. We're not going to have the problem of, well, if I'm in Japan, I can't pick that piece of paper up. But it's not going to support a big event. A big event means we've got thousands of people around the world trying to buy these tickets. A big event means I wrote a bot that's going to try to buy these tickets and sell them on StubHub. That single machine I have running in my dad's office is going to fold. It can't even queue people up; all the TCP connections are going to shut it down. Now, we're at the Cloud Foundry Summit, so we've got to talk about the cloud-native solution. This is the real solution that I wish I could have given. A cloud-native solution thinks about compute as a service and data as a service. So in the perfect cloud-native picture, I say: here's my app, here's how I sell tickets, put it as close to people as you can. Put an instance in the US. Put some in Japan. Put some in Europe. That way we take the speed of light out of the equation, or at least make it less significant. That's pretty easy to do with Cloud Foundry, and here's a sketch of how I've done it on my project. What I have is three completely separate Cloud Foundry instances, hosted in different regions of the world. What this allows me to do as a developer is say: here's my code, run it. I won't quote the haiku because I won't get it perfect, but that's the spirit. I don't want to reinvent the wheel of how to deploy an application around the world. In front of that, I put a global load balancer. "Global load balancer" is more the name of a product than a single load balancer. When I say load balancer, I don't mean an nginx box or HAProxy. I mean the Google front ends at the edge of Google's data centers.
Those are the same load balancers that are powering YouTube and ads, things that a few people are using per day. What we do in front of them is put an anycast IP. Anycast means we can respond to one address from many locations. So I have this one IP, and for me, up in Washington, when I dial it, it puts me into a data center in Oregon. Across the world, someone dials it, and it puts them into a data center in Japan. Once traffic gets into a Google data center, it's on Google's private network. The private network is, in a lot of ways, a second internet, but without the capacity issues and without a bit of the noise. Who's talking? My Google Assistant has just popped up, by the way. So I'm going to try not to use the word G-O-O-G-A-L-E, I guess. Now, the Google front end can do a lot of fancy stuff with traffic. Imagine you're running an application like YouTube. You're not going to say, all right, you're in Japan, but this data is over here in the US, so I'm just going to route that traffic there, because we're talking about gigabytes of data. It's a lot smarter than that. It can say: I know where I am, I know where this traffic is coming from, and I have a pool of healthy nodes really close by that I can send your traffic to. So it's going to send the traffic to whatever's lowest latency and whatever's healthy. Now, all roads here point to this ominous question mark. It wasn't too challenging to get these three Cloud Foundries deployed, at least if you've deployed Cloud Foundry a few hundred times. Once you have that down, it's not that bad, right? But where does that database come from? The state is really what makes the sale. A web application is great; it's a great way to do things. But in the end, we're just moving data around. In the end, all transactions are just data going from here to here, and that's what we're really interested in.
From a database, there's a set of things I need in order to deliver something that hits the same points as the ticket booth, but at a global scale. I need transactions: I need to be able to definitively say that when I got a ticket, it's my ticket, and no one else could get it. I need high availability and throughput for that initial surge of people coming in to buy tickets. I need global access so I can reach the world with this. And ideally, I would like to scale my solution with the load. We can scale our applications with Cloud Foundry pretty easily. You just say, I want 16 instances now; and later, no, I just need one instance. Ideally, my database can do the same thing, and I don't have to think about it too much. Now, for all of these classes of problems, there are tons of solutions, tons of ways to change your application to deal with them. Transactions, for example: you could stick your messages in a queue and have a central processor that pulls them out. But in a lot of ways, that's just creating a new funnel somewhere else. You could shard your data across the world and say, OK, these tickets in Europe are the ones we're going to look at over here. But then we're starting to hit the same issues we had with multiple ticket booths. You could have read-only nodes and write nodes. But as soon as you do that, your application has really changed: it connects to a totally different machine to read this data, and once it's finally ready to buy, it goes and finds this other special one. I see these all as workarounds to the traditional relational database, and there are endless ways to do it. But we want cloud native, right? We want just a service. Give me a database. It should be all over the world. It should be magical. Not possible? Of course it is. Spanner is the database service that powers AdWords, Photos, and essentially Google. There are some in-depth research papers that I cannot make it all the way through.
But if you're interested in how the time ticks under the hood, how the Google network plays a role in all this, and how it all fits together, we've been publishing papers about that since well before we released a cloud product. Let's look at the cloud product and what it can do for developers. First, it does the expected. You have a SQL language for getting data out, with joins and the expected indices. You've also got ACID transactions, so you've got guaranteed transactions. But what's awesome is that with Spanner, you can get single-digit millisecond latencies and up to five nines of availability, and it's all backed by Google's SRE team. Now, there are a lot of distributed database solutions out there. An example is etcd, which we use for some of our projects. Once you have these databases, they use quorum to spread things out, do transactions, and commit. But maintaining those things is incredibly hard. There are a few talks here about how Cloud Foundry is actually switching away from these sorts of distributed database systems, and a big part of the reason is that they're just so hard to manage. We've got an entire team of SREs doing that for Spanner, and they're also doing it for AdWords, so they're paying attention. Under the hood, Spanner is doing a lot of the workarounds we talked about. It's doing sharding. It can scale just by flipping a knob. It's doing replication, but you don't worry about it; a lot of those aspects aren't even visible to you. And what I think is the coolest part is that you use it as an API. You just ask for Spanner. You don't have to say, this is my Spanner server, this is the IP address I talk to. That's abstracted away. It's not something you need to worry about. I'm gonna dive into an example here, but before I do, I wanna talk about some of the transactional concepts that Spanner has.
There's a single read; that's exactly what you think it is, just read a piece of data. There's a read-only transaction, where you're gonna read many things at once. The read-only transaction is not gonna lock your database and freeze it in the sense that others can't write to it, but it is gonna be a snapshot in time where all your data is consistent. So at an infinitesimally small point in time, this is the snapshot of my data. You can do that in a transaction without any other performance effects. The last bit is the read-write transaction. The read-write transaction is where those ACID guarantees come in. It does pessimistic locking. Ooh, "pessimistic locking," it's a mouthful; I encourage you to read the paper. It does two-phase commits, and it can potentially retry actions: if things it depended on have changed upstream, it can retry those things. What I like about all this is that you don't have to understand it in depth to get moving with it. You don't have to understand how TrueTime works, or how the CAP theorem applies here: how in some ways we're CP rather than fully available, but still with five nines of availability. You don't need to understand that. You can get started with Spanner right now. So I'm gonna show you an application that I whipped together to be my answer to that interview question that maybe didn't go as well at the time. And I'm gonna switch to my mic number two, which is exciting. This ticket application is very simple. It's kind of a REST API, and what it can do is tell me what events there are, what seats are available, and let me do a purchase. So let's look at it. I just have some aliases here because the host names are a bit long. Just to show you there's nothing up my sleeve, these are just curls to an app I have running. And the app that I'm curling is behind my global load balancer. The way I have this deployed, I could hit a regional back end or front end directly if I wanted to, but I'm using Google's load balancer to say, give me the closest one.
What's available? So when I show my tickets, I can then show the seats available. It's REST-like. Okay, so I have a list of seats. Now, not all seats are equal. Some seats are like the ones you guys are in, that can be reserved ahead of time: a single seat, no one else can stand there. But there are also sets of seats that are just kind of a pool of people mashed around each other, general admission as they call it. Those have their own unique problems, but we're able to deal with them in much the same way here. So I'm gonna go ahead and buy the general admission seat, or sorry, the reserved seat. This is where it gets real demo. I should be using my Vim keys, but not in front of this audience. Okay, so I sent a POST. Of course, I got back a GUID, because it's the cloud, but trust me, that is a successful purchase. If I look for it, we can see that ideally there are no longer any available for that specific ticket. And if I try to purchase again, it's not gonna work. Okay, not that big of a surprise. Now, where this gets interesting is when we have thousands of people doing it at once. So I'm gonna go ahead and spin up a few machines on GCP to start purchasing tickets against my GA seats, and we'll look at how that goes. Let me do a watch just to see what seats are there. What this is doing is launching GCP preemptible instances, each with a single core. Preemptible means the instance could get shut off at any time, interrupted for other work, but it also means they're significantly cheaper. I won't quote a percentage right now, but all I care about is that I've got a bunch of machines curling this API and buying tickets. As this spins up, I wanna take a look at what the code looks like. This is a simple Go web app, and I'm gonna look at that seats method. To talk to Spanner, I'm using the Spanner SDK. I'm providing it a SQL query; nothing very fancy there.
And I'm pulling data out of it. So this is just a single read, that first class of transaction we talked about. To date, there is not a Spanner ORM, but I'm interested in one if anyone else is interested. This kind of simplifies things and shows you what's really happening underneath the hood. So I'm doing a select from Spanner, pulling the data out, unpacking it, and constructing a list of objects. Here's where the transaction happens. The way Spanner does a transaction, at least in Go, is we start with a function. We give it a function and say: within this function, execute this code in a single transaction. What Spanner's gonna do is keep track of all the things that I'm reading, meaning all the things I'm making assumptions based on, and all the things that I wanna write based on that. If things change while that's happening, if people were able to update the things I read, then it will try to replay that transaction. On this line here, right under this; I think it'll become legible once I do that. On this line here, I'm just reading a row, just as I did in that list, but I'm doing it as part of that transaction. And the row I'm reading is just to see how many seats are available. I unpack the request, just unpacking variables; kind of an ORM would be nice here. And I make a decision: if there are no seats available, that's an error. We're done. We're not gonna retry this transaction; it's over, it was a failure. That's fine; not everybody can sit in one seat. But then I make a write as part of that transaction. And I'm making two writes. The first write is to update that seats-available count: I'm decrementing it, removing one available seat. For the single-occupancy seats, there was just one. For GA, there's 100,000 or something, and I'm saying there's one less. Spanner's not using a decrement operator.
It's not pushing a message with a decrement operator, because a decrement operator doesn't guarantee that it's decrementing from the value I expect. I'm literally just saying seats available minus one. It's critical that this is wrapped in a transaction that says: if this data changed, don't commit. This would be dangerous to commit on its own, because if we had two of these machines going at once, and we should have 20 or something hitting our Cloud Foundry, they would both commit the same value. Not acceptable here. So all that mutation is wrapped and buffered in that single read-write transaction and committed, and we know how it went based on the output. When it's done, we're gonna have an error if it was unsuccessful, or we're gonna go ahead and use our purchase ID and write an entry in the purchase table. That's how we denote success here. So let's see how that's going. Thank you. Okay, we can see that count here is lowering; it's periodically refreshing and we're polling it. And those purchases are coming in from all over the world. Let's confirm that "all over the world" thing, because that's kind of a bold claim. I'm gonna hop into the GCP console and look at that load balancer. Now we're looking at the specific backend service that's serving our load balancer. We can see the volume of where our traffic is coming from and where the traffic is going to. On the left-hand side here, there's a lot to point at. Right here we have our front-end location. That's where those GFEs are, the ones serving YouTube traffic. They're now serving our ticket sales, and we're making a lot of money. We have more traffic here coming from Asia, because we have more of those machines spun up there, and you can see it's routed directly to this Asia backend that I created. That's one of the Cloud Foundries in that list.
All the other traffic is able to route directly to whatever's closest to it. And this represents the actual flow from where the data enters to where it gets processed. All I had to do for this was put an anycast IP in front; there's no real magic beyond that. All we need to do is say, okay, this DNS record is this anycast IP, and the GFE knows how to handle it. Now, I showed you before that the way we represent purchases is a little rough: we're currently just giving you a purchase GUID. But without doing much development, you can start to look at that data. What Spanner gives you in the cloud console is a nice little query spot and a way to look at your database. So here's my Spanner instance. This Spanner instance has just a single node. If I needed to scale it, I could look at this information about how utilized that Spanner node is and just change a number here. What happens if we do 10? There we go, we just get 10 Spanner nodes. I'll need to shut that off, because I think that costs money. But I have my database here, and I can take a look at the schema, as you'd expect, and I can also do queries. So let's look at the purchases. Nothing like writing SQL on stage, okay. That's a simple one, though. You could store a lot more interesting information besides GUIDs, but what I've stored here is: what event is it, what seat is it, and what was the purchase ID. As I was building this, the first thing I reached for when I was thinking about IDs was, well, IDs, that means an auto-increment int. But that doesn't really make sense. There's really no need to do that with Spanner, and Spanner doesn't have support for auto-increment ints. If you wanted to, you could implement it; you could have a numbering service. But as you think about it, you go, why do I want that? I really just want it because that's how I'm used to doing things.
I'm used to having things in that sequential order. But sequence is really more about time when you're looking at data like this and a database like this. Yeah, I think we're almost done here. So how can you get started today with Spanner? You can check out Google Cloud. You can create a Spanner instance through the GCP service broker, so it's available in your favorite CF marketplace wherever the service broker is available. And the thing I really recommend checking out, and what I based a lot of the code for this talk on, is Getting Started with Cloud Spanner in Go, which is in the GCP documentation. An incredible, incredible example that really walks you through a lot of the core concepts. One thing I really want to mention is a feature coming to Spanner in 2017, and that is cross-region replication. I showed you Spanner and said that you can access it from anywhere, which is true; I did some of my development on a plane and I was able to hit it. But for those really performance-critical applications, Spanner will soon be able to replicate across GCP regions, so it'll be even faster. The nice thing is that today you already have the GCP network that the traffic goes over, so the latency isn't as bad as it could be. But later this year we'll see cross-region replication without any work on your end. Probably just tick a box. So thank you all. I'm happy to take any questions after, and I'll be around tomorrow as well.