I guess this is the spot where we have a lot of the Swift stuff. So just so you know, there are a bunch of Swift technical tracks today too, where people are talking about the next versions of Swift and what's going to be available. Those are happening right now — not that I'd ask you to leave or anything — and they're pretty much queued up all day. Besides this track, right before this one there was another Swift session about how to store objects, and then this afternoon at 4:30 there's a workshop on installing Swift. If you're interested in that session, everyone is welcome; it's this afternoon at 4:30, and I think it's just right next door. We'll go through the whole process of getting Swift up and running, and we'll actually do it two ways: we'll walk through the command-line installation steps, and then we'll blow through an automated install. So that's what we have going on this afternoon at 4:30. Last bit of housekeeping: there's a book on Swift — I shouldn't pitch it now, but afterwards, if you want to pick one up, we have a few at the booth for those who are interested. It walks through how to use Swift, how to install it, things like that.

So hello, my name is Joe Arnold. I'm the CEO of SwiftStack, and I'm here with Dan Wilson. Dan, can you introduce yourself? Sure, yeah. Hi, I'm Dan Wilson. I run the Global Infrastructure Architecture team at Concur. My role there is to transform our data centers from monolithic to highly available, scale-out architectures.

So through this talk, what we're going to do today is introduce what Concur does in a little more detail, introduce SwiftStack and what we do, and then run through a super high-level introduction of OpenStack Swift just to orient us. Then we'll get more into Dan's use case at Concur — the problems he was trying to solve, how he went about solving them, and some of the pros and cons of the existing and the new system. And then we'll talk about the global replication functionality available in Swift and some of the benefits it has.

So let's get started with Concur. For anyone that doesn't know Concur, we automate business travel and expense claim reimbursement. We also have TripIt as a property, and we're growing very rapidly. There are quite a number of companies in the Fortune 500 using our systems, but believe it or not, we've grown more in the last year in the small business segment than in any other segment of our business.

And Swift — Swift is an object storage system. It's created specifically for unstructured data. That means not databases and not running operating systems; it means everything else — documents, images, videos, log files, snapshots. The list goes on, but that's the bulk of a lot of the data that's out there. It's designed around principles of being highly reliable, scalable, and able to be deployed on lots of different types of hardware, typically commodity-based hardware. And SwiftStack — what we are is a company that provides a product around OpenStack Swift. We like to make it easy to get started and easy to scale, so you can add nodes quickly into an environment, and also to deploy on lots of different types of hardware. What we have is two components.
On the right-hand side, we have something we call SwiftStack Nodes, which is a Swift runtime. That includes a UI for users so they can get access to it, load balancing (or load balancing integrations), a few different authentication options like LDAP and Active Directory, and then Swift itself. As a company, we have the project technical lead, and we also have a lot of the core contributors to the project, so we're able to drive a lot of the innovation and do a lot of the fixing in Swift itself. We install on standard Linux distributions — Ubuntu, Red Hat, CentOS — which can be put on all sorts of standard hardware. Then we have a controller which helps manage, deploy, operate, and scale the environment. That's what we piece together.

Okay, Swift introduction. We talked about it being reliable, highly scalable, and hardware agnostic. What that really means is that it's an object storage system, meaning you can reference the data via a URL over HTTP. It's not blocks, it's not a file system. When the data goes in via HTTP, it gets distributed internally using a consistent hash ring to place data across different nodes in the cluster, and it gets configured with zones and regions, which we'll get into later. It's an HTTP API, so you build applications around it, and it handles high concurrency and supports lots of users. It's also highly scalable: it's multi-tenant, so you can have lots of user accounts, and it can be tiered out, so each layer in the architecture can be scaled independently. It's designed not to have a single point of failure. Any node can be dropped into the environment, and if it goes away, failures get routed around. As a philosophy, we assume we're running on top of unreliable hardware — and failures happen all the time in data centers, of course. We've even gone and unplugged things ourselves; we've had lots of disasters strike. But it's designed so that it's okay if those failures happen; that tolerance exists in the software. And we actually do mix and match different hardware vendors underneath the system.

So how are people using this? One way is that they have applications that talk directly to the storage system — they build clients — and that's what Dan was doing at Concur; there's all sorts of language support available. But there's also other software, like file system gateways, desktop clients, backup software, archiving, and file sharing, that people just use out of the box. And in the data center, people will put data in and out of it as part of backup jobs, application data, log data, things like that.

The project started in 2009 with Rackspace wanting to build a competitor to Amazon S3, so they built something called Rackspace Cloud Files. It got open sourced as one of the first two OpenStack projects, and over the years lots of additional things have happened with the ecosystem. We've amassed over 136 developers who have worked on the project, it's deployed in lots of places, and there are lots of contributors. It runs some of the largest storage clouds out there — including, of course, Rackspace's, but IBM SoftLayer and HP's cloud also run on OpenStack Swift. So that's, in a nutshell, a bit about OpenStack Swift.
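As a quick editorial aside, here is roughly what "referencing data via a URL over HTTP" looks like against a Swift cluster — a minimal sketch using Python's requests library. The endpoint, account, container, and token values are placeholders; in a real deployment the storage URL and token come from whatever auth system sits in front of the cluster.

```python
import requests

# Placeholder values -- in practice these come from your auth system.
storage_url = "https://swift.example.com/v1/AUTH_concur"   # account URL
headers = {"X-Auth-Token": "AUTH_tk_example"}               # auth token

# Create a container, then PUT an object into it over plain HTTP.
requests.put(f"{storage_url}/receipts", headers=headers)
with open("receipt-0001.jpg", "rb") as f:
    requests.put(f"{storage_url}/receipts/receipt-0001.jpg",
                 headers=headers, data=f)

# The object is now addressable by URL; a GET returns its bytes.
resp = requests.get(f"{storage_url}/receipts/receipt-0001.jpg", headers=headers)
print(resp.status_code, len(resp.content))
```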
On to the use case. Okay, Dan, so what does Concur do, and why does it need Swift?

Well, one of the critical parts of the Concur system is where you can take a picture of a receipt while you're traveling and upload it into the Concur system so that it's tracked and stored. We do some processing against it as well — we have a new feature now where we're OCRing the data on the receipt and creating the expense entries for you, so you don't have to actually fill out your expense report. Kind of nice. But the history there is that before we had mobile, you had to tape the receipts to a piece of paper and fax it in to our system, and then we would convert it to a PDF and store it on our file system. That's what we originally wrote our storage for; it was that type of design. So it's changed a little bit, and it's scaled up quite a bit as well. A year ago we were processing 150,000 images a day, and now on an average day we're processing 1.5 million images.

Let me walk through what the architecture previously looked like. This is a rough picture of what it looked like before. Essentially, every machine that was doing image processing would pick up its work from a queue, but it wasn't just the image being processed that it picked up from the queue — it was the step that it had to process. For example, for the scan step, it knew it had to scan the barcode off of the image. So it pulled the image off of our Windows file share, ran some customized code to look for the Code 39 barcode on the fax, and once it recognized the barcode, it would throw the image away and store information about the barcode back to the storage system. Then it would tell the queue it was ready for the next step, and any of the servers in our processing tier could pick up that next step, and it would, once again, pull the image off of the storage system. So you can see there was a lot of back and forth on our network; it wasn't very streamlined or optimized. Since we had to fix that queue tier and the processing tier anyway, we decided it was the perfect time to bring in an object-based storage system that could give us the internet scale we really needed.

This is the way it works now. Uploads and faxes — you can see them entering on the left — go to our queue tier, which we can scale out individually. Each one of the imaging servers, our processing servers, gets its work from the queue, and the picture below shows the steps they perform once they get their object off the queue. They go linearly through scan, convert, thumb, reduce, OCR, and future steps, and as they perform their work, the data about the image, or the files that are produced, gets stored to SwiftStack, which once again can be independently scaled.

Can you walk through the encryption that's needed for this application? Yeah, encryption's really important to us. We have a very unique encryption design where there are three keys and a salt, and two of those keys are dynamic. Every image has its own key that's stored to the storage system. What's important about that is that if anybody ever got the data and brute-forced one object, they would only get the data for that one object. They would not be able to reverse engineer the key and reuse it for other data in the cluster; they'd have to go through that brute-forcing technique for every single object. Probably pretty important for financial data, all right.
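Concur's exact scheme isn't spelled out here, so purely as a generic illustration of the per-object-key idea Dan describes — a master secret combined with per-object randomness to derive a unique key for each object, so brute-forcing one ciphertext exposes nothing else — a sketch might look like the following. Every name and parameter here is hypothetical, not Concur's actual design.

```python
import base64
import hashlib
import os

from cryptography.fernet import Fernet

MASTER_KEY = os.environ["MASTER_KEY"].encode()  # hypothetical static secret

def encrypt_object(data: bytes) -> tuple[bytes, bytes]:
    """Derive a unique key per object from a random salt, then encrypt.

    Brute-forcing one object's ciphertext only recovers that object's
    derived key -- not MASTER_KEY, and not any other object's key.
    """
    salt = os.urandom(16)                                    # per-object salt
    derived = hashlib.pbkdf2_hmac("sha256", MASTER_KEY, salt, 200_000)
    key = base64.urlsafe_b64encode(derived)                  # Fernet key format
    return salt, Fernet(key).encrypt(data)
```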
Okay, so walk through some of the issues you found with the previous system in terms of availability, performance, and cost.

Yep. Like I mentioned, we had a Windows-file-share-based technology, and even though it was a Windows file share, it was a grid architecture — it was designed for high availability, but it wasn't designed for failure. We ran into a number of situations where a single node would hit an issue where it wasn't completely down; it was just down enough to take the whole cluster down, and the system would not handle it. It just didn't work for us.

Okay, and then the performance characteristics, in terms of how you had the system set up? Yeah, once again, on the performance side, the Windows-file-share-based technology didn't give us the ability to let a single server communicate with every node in the cluster. A server got pinned to a node because it was relying on the CIFS/SMB protocol. What would happen is that over time, as those servers were recycled or added or removed for capacity, we would end up with hotspots in our storage cluster, because more and more processing servers were being pinned to a single storage node. So it just wasn't very scalable for us.

Cost? Yeah, cost — obviously when you're using a vendor's solution, you don't have much flexibility on cost. You can negotiate with them, and that's about it. And it being a black box is frustrating as well. I talked about that issue where one node went down and took the whole cluster down — well, that was one of those situations where we got a strange message that didn't really tell us a whole lot about what the problem was. We couldn't dig into the code. We had to collect a ton of logs and send them to support, and they still don't know what the problem was.

Okay, all right, so then we transitioned to Swift. What's the delta here in terms of availability and performance? And walk us through the configuration you have now, with the multiple regions you have with Swift.

Will do. We have two regions, with multiple zones in each region, as you can see here. That's really important for us because it gives us resiliency across multiple data centers: if a single DC goes down, we're still up and running. It is designed for failure, so at the slightest hint of a problem it's going to drop a whole node out and keep the system up and running instead of just dying. I'd much rather have it cut off an arm and stay alive than take everything down. And on the performance side, in each data center, as data is being written to that data center, it sends an acknowledgement back to our processing servers that the data is written as soon as two copies are stored. It doesn't wait for the data to be distributed across the entire cluster. And the processing servers — or the separate tier that reads the data — can read data locally in either data center.

Okay, so now you're using open source, OpenStack. What does that mean from a flexibility perspective? Yeah, it's new to us, right? But we love the fact that we can give back to the community as we run into challenges and dig into the code. So you guys are located up in Microsoft country, right? Yeah, well, it's not just Microsoft country — it's Seattle. There's Microsoft there and Expedia, but we've got AWS there too. That's true, that's true, okay. But yeah, as a region it's typically a bit more closed source, so open source is something we're not used to, but we're excited about the idea of being able to give back to the community.
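Going back to the two-region, multi-zone layout Dan described: in Swift, that topology is expressed when the storage rings are built. The commands below are a rough, abbreviated sketch with made-up IPs, weights, and only one device per zone — not Concur's actual build — but the r&lt;region&gt;z&lt;zone&gt; prefix on each device is what tells Swift which region and zone a drive lives in.

```
# 4 replicas total (as in Concur's four-copy design), partition power 17
swift-ring-builder object.builder create 17 4 1

# Region 1 (data center A), a couple of zones -- illustrative IPs only
swift-ring-builder object.builder add r1z1-10.1.1.10:6000/d1 100
swift-ring-builder object.builder add r1z2-10.1.2.10:6000/d1 100

# Region 2 (data center B)
swift-ring-builder object.builder add r2z1-10.2.1.10:6000/d1 100
swift-ring-builder object.builder add r2z2-10.2.2.10:6000/d1 100

swift-ring-builder object.builder rebalance
```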
All right — the hardware. Yeah, this is the hardware design that we used. For us, it worked out really well for the business case we're trying to solve to have a very, very dense configuration. We use Supermicro servers that are rebranded and integrated by someone else, so we don't have to assemble them — Silicon Mechanics? Yes, Silicon Mechanics — and we use four-terabyte hard drives in our configuration. You can see all the specs there. The reason this worked out really well for us is that we have a huge amount of retention we have to maintain, so we needed to scale the system for that: keeping the data for a very long time and scaling out enough nodes to handle our ingestion and our processing. What happens next year? Well, next year we just add more servers, and if there are six-terabyte hard drives next year, we can buy servers with six-terabyte hard drives and they get integrated into the same cluster. We don't have to build out a new cluster; it's easy to grow. If we find that we need more processing at the proxy tier, we just add more proxies. And if we want to move into a different region, we just add another data center as another region in the cluster. It's awesome.

Drumroll on the TCO. Walk us through the math you did to get to this, because I'm sure your CFO was all over you on figuring this out. Yeah, yeah. Anytime I'm presenting a business case to management these days, I have to convert all of my financials to what they're seeing, and they're seeing cloud prices, right? They pull open CFO magazine or CTO magazine or whatever, and it talks about dollars per gig per month. So that's how I presented this and calculated it. I tried to take everything into account: not only the power, the space, and the cooling, but also the management costs of the people doing the work and all of our licensing with SwiftStack and Ubuntu. The other thing I factored in here is that we store four copies to protect the data, so if I did the calculation with just one copy, the price would be one quarter of what you see there — but I built that into the structure. One thing I didn't have to calculate in this model is any cost for downloading files out of the system; since it's my cloud, I don't have to charge myself for that, whereas you typically do have to pay for that with public clouds. Yeah.

Okay, so then what does this mean for the future? What are the growth plans? Where is this all going from a business and architecture perspective? Sure. I don't mean to diss cloud systems, because we see a need for them in our future too, right? It didn't work out financially for this specific use case, but there will be situations where we leverage cloud services, and that's why we thought it was so important to get the encryption story and that design right, and why we wanted to leverage an API that is going to be, or already is, an industry standard. So in the future we're looking at expanding to other regions by leveraging cloud providers to get even more locality to our customers, and also just continuing to scale out our system to handle the increase in demand. We grow at about 30% year over year, and that growth has been pretty consistent and steady for the last 10 years — just up and to the right.
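As an aside on the dollars-per-gig-per-month arithmetic Dan describes: the numbers below are entirely made up, but they show the shape of the calculation, including how protecting the data with four copies divides the usable capacity (so the per-GB price is four times what a one-copy calculation would give).

```python
# All figures are hypothetical, for illustration only.
raw_tb = 2000                    # raw disk capacity in the cluster (TB)
copies = 4                       # replicas kept to protect the data
usable_gb = raw_tb * 1000 / copies

monthly_cost = sum([
    40_000,   # amortized hardware
    6_000,    # power, space, cooling
    8_000,    # people / management time
    5_000,    # SwiftStack and Ubuntu licensing/support
])

print(f"${monthly_cost / usable_gb:.3f} per GB per month")
# With copies = 1, the same math gives a price one quarter of this.
```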
I don't see that growth slowing down anytime soon, and as more and more of our customers switch over from faxes or our web-based application to mobile uploads, we'll continue to get more and more images into our system. With the fax or web-based upload model, you end up getting a much higher number of receipts in a single image, whereas with the mobile workload, every single receipt is a separate image for us. So yeah, we have to be ready for the billions. We already have three quarters of a billion images in our system, but it's going to get much bigger, fast. Cool. All right, thanks, Dan.

So in the next section we'll talk about OpenStack Swift and global replication and how it works. Yeah, questions? Yeah. So the question is: what's the difference between Swift and Hadoop? Well, Hadoop is generally more about data processing, and Swift is more about unstructured data storage — kind of different use cases. The follow-up from the audience: Facebook puts images, video, and text into Hadoop and you can search it, and you can also put images into Swift, so it seems like a similar case — but there's OpenStack Savanna, which connects the two, so how do they work together, and where does Savanna link in? That's why there's a Savanna session after lunch. Yeah, there are lots of things that plug into Swift, and certainly there's a module that can connect a Hadoop cluster to a Swift cluster so they can interoperate, similar to how Amazon has their MapReduce functionality. So that certainly exists. There are other projects like ZeroVM, which pushes compute jobs into the infrastructure, but everything's optimized for its ideal use case. This use case with Concur — storing images and unstructured data and serving it out to mobile devices — is really what Swift is built for, and that's the specific use case we're talking about here, not data analytics or job processing. Sorry, can you talk a little bit about...? Yeah, I will at the end.

So the way we're going to structure the second part of the talk is to cover global clusters, how we got here, and then what's next from a functionality perspective with Swift. Yeah? Why Swift and not Ceph? Well, that's a good question, and of course I'm going to have a biased point of view on this, but the thing is, object storage is all we think about — serving out documents, serving images — and we believe that by specializing we're going to be better at it. We don't concern ourselves with block storage at all, or with structured data storage. It really goes back to the origins of what Swift was designed for: how do we do data placement when what we're trying to do is store and serve unstructured data assets? The way we do that is by using an eventual consistency model, and with eventual consistency I can distribute data across lots of different places in an environment, and if there's ever a sever in the network at any of those locations, I'm still okay, because I can go to any one of the other locations to grab that data, or even write data into the system. So for example — here are three clusters; I should have put a grid around each one because they're different — there's a single-node cluster, and what we would do is store data uniquely across different drives in the system.
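To make the placement idea a little more concrete, here is a toy sketch of the consistent-hashing concept — not Swift's actual ring code. The object's path is hashed, the top bits of the hash pick a partition, and each partition is assigned to a fixed set of distinct devices, so the copies of any one object always land on different drives.

```python
import hashlib

PART_POWER = 4                       # toy value; real clusters use far more partitions
REPLICAS = 3
DEVICES = [f"node{n}/d{d}" for n in range(4) for d in range(2)]

# Toy partition -> devices table: each partition maps to 3 distinct devices.
ring = {p: [DEVICES[(p + i * 3) % len(DEVICES)] for i in range(REPLICAS)]
        for p in range(2 ** PART_POWER)}

def devices_for(account, container, obj):
    digest = hashlib.md5(f"/{account}/{container}/{obj}".encode()).hexdigest()
    partition = int(digest, 16) >> (128 - PART_POWER)   # top bits pick the partition
    return partition, ring[partition]

print(devices_for("AUTH_concur", "receipts", "receipt-0001.jpg"))
```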
So if you have just one node and you start scaling up to a small cluster, you might have a few different storage nodes, and data would be placed across those different nodes. If one of those nodes were to go down, I can still receive data into the system. There's no RAID rebuild that happens, there's no global write lock; I can still get data out, because I still have enough surviving nodes. It gets even better as you scale up: you can have larger clusters where you start to have storage racks or rows in the data center, and there, even with network connectivity issues, you can still very much put data into and read data out of the system. It's incredibly tolerant of failures within the data center.

So that got us thinking: if we're that tolerant within a data center, how can we expand this to multiple data centers? How can we add the concept of regionality into Swift? We had the concept of what we called zones — oftentimes those would be racks in the data center — but how could we take a higher-latency link between two different data centers, treat it a little bit specially, and build on top of this eventually consistent data model so we can handle use cases like the one we just heard about? So this is what we had: the typical model was a three-replica model configured with multiple zones. Then we said, okay, that's great, multiple zones — let's add the concept of regionality on top of that. What we did was allow the configuration to group different zones together, and then data can be placed across those zones. If there's a network sever between the two regions, you can still read and write into either one of them, and when the connection gets re-established, there's a synchronization event. The model is eventual consistency, which means the newest file wins. Because applications that are already using Swift know about the eventual consistency model, it just works, and you don't have the global locking issues you might have with other systems that try to maintain file consistency across a broad geographic region. You want that file consistency if you're running an operating system on the storage, because you need to lock files, but in a case like Concur's you don't need it, because every time you upload a new object it gets a unique new name, and the application is very tolerant of that model.

When we can do that, we become more tolerant of failures, and we can utilize more capacity. Instead of having a DR site that's only used for that purpose and never actually receives traffic, we can have multiple data centers, multiple regions, and just utilize all of that capacity. That means we get better user response for the application: when all cylinders are firing, we get great response back out to the user, and if one of the cylinders, so to speak, goes down, then okay, we have a little bit less power, but we can still be responsive to the application.

So here's how it works. This little icon here is the proxy server. A write will come in, and it will place all the copies in that region first — what we call an affinity write.
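That affinity behavior is controlled in the proxy server's configuration. Roughly, a proxy sitting in region 1 might carry settings like the ones below (the values are illustrative): write_affinity makes the initial copies land in the local region, and sorting_method/read_affinity make the proxy prefer nearby copies on reads.

```ini
# proxy-server.conf (excerpt, illustrative values)
[app:proxy-server]
use = egg:swift#proxy

# Prefer local copies when reading: lower number = higher priority.
sorting_method = affinity
read_affinity = r1z1=100, r1=200

# Write all initial copies into region 1, then let replication
# move data to the remote region asynchronously.
write_affinity = r1
write_affinity_node_count = 2 * replicas
```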
You can turn that on and off, by the way. Then the data gets distributed across to the other region, and the extra local copies get dropped. That's roughly how it works, and we have a little video here that we showed on the main stage, which animates a walkthrough. There's the configuration — this is a view into our product showing how you get that set up. Then someone drags a file onto the application. Let's say they're in one part of the world and there's a cluster in, say, Portland. There's a GeoDNS layer, which is part of the solution — I think you had that set up with Akamai? With Akamai, yeah. And there are a few other global GeoDNS services you can use. The write happens, replication kicks in, and once the data gets copied over and confirmed, the copies get freed up in the other location. If someone somewhere else in the world requests that file, the request gets routed to a particular data center, and they get the data faster than if they had gone back to the other side of the planet. That's more or less how it's put together.

I didn't mention this when I talked about the TCO, but it made a huge difference in the TCO calculation to store four copies instead of six. Granted, it's not quite as good as erasure coding — we could drive even more cost savings into the system with that. We'll talk about that later. Yeah, for sure. But I think the big difference there is the user performance, or the perceived user performance — I don't know if you could talk to how that factored into the decision as well. Sure. With Akamai, one thing we do have, which is nice, is that we can leverage Akamai's edge network for all of our file uploads: they synchronously upload to the edge and then asynchronously upload from Akamai to our storage system. But you can't optimize for downloads with Akamai, and that's where the regionality comes in — having that global DNS route requests to a local data center to pull the data down makes a big difference for our end users. Cool.

So from a write perspective, how Swift works is that one of the proxy nodes receives the client request and then streams the writes into the storage environment. There's no concept of a master; everything is equally weighted in the system, and that's part of what gives it its scale capability. When the write happens, there's the concept of a quorum, where a majority of the copies need to be written into the system. Typically, with one region, we use three replicas, so in that example two of the three writes need to be successful for the write to be acknowledged as being in the system. From a read perspective, the proxy server will select one of those replicas, go to that storage location, read it, and send it back to the user. So it's the proxy service in this configuration that enables a lot of how all of this works. In a multi-region configuration, the proxy has an understanding of where it is: it knows what region it's in, what zone in the data center it's in, and what storage nodes are nearby, so it can use that information to decide where to go first to try to get the data.
So if it looks up an object and sees that it happens to be in the rack it's in, it will just go to that local rack, retrieve the data, and send it out. Additionally — and this is automatic even if you don't have a multi-region setup — it will keep track of the latency to each of the storage nodes. So if the object isn't in that local rack, but the proxy knows it's in its region, and one rack is nearby while another is slightly further away from a latency perspective, it will choose the closest storage location and keep track of that.

Yeah, good question. The question was: does the proxy become the bottleneck? The way it works is that there are lots of proxies in the system, and they're also a shared-nothing component of the system. We talked about using Akamai as DNS; below that there's a load balancing tier which routes requests across lots of different proxy nodes. So if you have a proxy node outage, or you need more capacity at the proxy tier, you can just add more. We showed a picture of what those proxy nodes look like, and there are typically two 10 GigE ports on them; saturating those usually isn't an issue in the environments we've seen. So can it be a bottleneck? Absolutely, if you under-provision it — just like you can run out of storage if you don't provision enough disk. The proxy will also detect failures on reads and route around them.

Do you want to repeat the question? Oh yeah, sorry. The question is: if you do a write and then read immediately, is there an issue? There are strategies for dealing with that at the application level, if you need a slightly stronger consistency model, by being more aggressive about finding the newest object — but it's very much an edge case, and eventual consistency handles it quite well.

One more point on this slide, which is about global DNS and how it routes users: that sits above the system, as Dan described in how they use it, and there are pools of proxy nodes that get connected to as part of the system. Separately, there's a WAN configuration, and you can QoS that independently, which is nice, because sometimes that's a more expensive network and you want to be able to throttle it.

And what's coming up next? There are two things left. One is that now that we've added regions, and there are folks like Dan running multiple data centers, the ability to add another data center doesn't necessarily mean you automatically want to replicate all the data to that particular data center. So we're introducing the concept of storage policies. What that means is that we can create a storage policy and lasso, if you will — to borrow Mark Shuttleworth's "choose your own adventure" phrase — the regions you want, the drives you have, and the number of replicas you want. That allows a lot of flexibility in how you configure your system: you can continue to add regions and create storage policies that cover them. The question is, how do those storage policies get applied? They get applied at the container level, which means a user can have different containers in their namespace, if you will, and those containers can have different storage policies assigned to them.
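As a sketch of what that looks like from the client side once storage policies are available: when you create a container, you name the policy it should use, and objects in that container follow it. The policy names and endpoint below are placeholders, and the exact mechanics were still landing at the time of this talk.

```python
import requests

storage_url = "https://swift.example.com/v1/AUTH_concur"   # placeholder account URL
headers = {"X-Auth-Token": "AUTH_tk_example"}

# A container that replicates across both regions (hypothetical policy name)...
requests.put(f"{storage_url}/receipts",
             headers={**headers, "X-Storage-Policy": "two-region-4x"})

# ...and one pinned to a single region (hypothetical policy name).
requests.put(f"{storage_url}/scratch",
             headers={**headers, "X-Storage-Policy": "region1-only"})
```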
One of the next storage policy types we're implementing is erasure codes. We're doing this with Box and Intel, and it's going to allow us, in addition to the different regions, to create a storage policy where you can have a single region with erasure coding underneath. That work is underway for the Icehouse cycle. And that's it — I think we're coming in just at the bottom of the hour. If you have any questions, Dan and I will be up here on stage afterwards. Thank you for attending. Thank you. Thank you.