 Okay. Good morning, everyone. Hi. Thanks for coming along to this talk. We're going to be telling you about a new feature that's been added to Swift in the latest release cycle, which is the ability to encrypt objects. I'm Alistair Coles. I'm with Hewlett-Packard Enterprise, and I'm also a core reviewer on the Swift project. And I'm Tim Burke. I'm a software developer at Swiftstack, and I am also a core reviewer. Okay. So my slides seem to be auto-advancing for some reason. I'm just going to pull out this clicker. So Swift object encryption. We're really excited to be able to tell you about this this morning. At the last summit in Austin, myself and Janie Richling gave you a preview of this work. Since then, we have released the first version, and it's available with the Newton release. So here's what we're going to cover this morning. We're going to talk a little bit about why we have added this feature. We're going to tell you a little bit about how it works and show you a demo. And then Tim has some performance results that he's going to share with you. But before we get into that detail, particularly for those who may not be familiar with Swift, we're going to give you a very brief overview of what it is and what the Swift service is composed of. So Swift is an object store. It's ideal for storing blobs of unstructured data, like photographs, movie clips, music, even virtual machine images. Swift has a REST API. It's accessed using HTTP. And that API offers a simple set of what we call CRUD operations. So you can create your objects using puts, read them back. You can update your object with a post request and delete them through this API. Swift is not a hierarchical file system. Swift has a very shallow and fixed naming hierarchy. So objects are grouped into containers, and those containers are owned by accounts. And an account in Swift has a one-to-one relationship with a tenant or project in the context of OpenStack. 
So there's a number of clients you can use to interact with Swift. This just shows an example of using a simple curl command line. But there are clients with a number of language bindings for Swift. Okay, so one feature of Swift is that it is highly scalable. And, in fact, in a series of previous summits, we've heard talks describing production deployments of Swift that have scaled to tens of petabytes, storing billions of objects. How is that scalability achieved? Well, any object that's put into a Swift service is first handled by a front-end proxy server. It ends up being stored in a pool of back-end storage nodes. Now, the proxy server uses a scheme known as modified consistent hashing to choose which of those storage nodes each object will be stored on. I don't really have time to get into the detail of modified consistent hashing. If you want to know more about how that works, please come and talk to myself or Tim afterwards. We'd love to explain that to you. But the bottom line is that objects are distributed evenly across that pool of storage nodes. And that means that load is distributed evenly across the nodes. And that aids in scaling out of Swift. The way in which the proxy server chooses a particular storage node to map an object to is to use a hash of the object's name. Now, that's a deterministic operation. So what that means is that the proxy server doesn't need to do a huge amount of lookup to find where any particular object should be stored, or indeed to find an object that the client wants to read back. And it also means that proxy servers don't need to be communicating state between themselves. So that's another design choice in Swift that enables it to scale. As well as being scalable, Swift also offers a very durable storage service. So in fact, every object that is put to Swift is actually stored across multiple back-end storage nodes.
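The deterministic, hash-based placement described above can be sketched like this. This is a toy illustration, not Swift's actual ring code; the node list and function name are made up, and the real ring also handles partitions, weights, and zones:

```python
import hashlib

# Toy stand-ins for the cluster's storage nodes.
NODES = ["storage-1", "storage-2", "storage-3", "storage-4"]

def pick_node(account, container, obj):
    # Hash the object's full path; the same name always maps to the same
    # node, so no lookup table or shared state between proxies is needed.
    digest = hashlib.md5(f"/{account}/{container}/{obj}".encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

node = pick_node("AUTH_test", "c", "o")
```

Because the mapping is a pure function of the name, any proxy server can locate any object independently, which is the property that lets proxies scale out without coordinating with each other.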
Now, that's achieved either using a technique known as erasure coding or using a simple replication strategy. And as illustrated on the slide here, typically that means that Swift would store three copies of the object, each of them on a different storage node. So this means that your data is very durable. It also means that your data is highly available. Swift is able to continue to serve read requests for objects, even in the face of disk failures or storage node failures or maybe a network failure or partition. So Swift is scalable, durable, highly available. And with what we're describing this morning, we're making Swift more secure as well. But I emphasize the word more there, because Swift has not been an insecure service prior to us adding encryption. You may be aware of the Keystone identity service, one of the OpenStack projects. Swift takes advantage of Keystone to authenticate every request that arrives at its API, and then authorize the operations that that request will make on objects. And also just looking again at that split architecture of the Swift cluster: every request is handled by proxy servers. So that means that the much larger pool of storage nodes is completely isolated from any public-facing network interface. So Swift hasn't been insecure, but we've been working hard to add an additional security feature, and Tim's now going to tell you some more about that. So you might wonder why we needed to implement this when Swift is not insecure. The important thing to keep in mind is that the public-facing API is just the most obvious way that data can leave your cluster. Ask any operator and they'll tell you that data needs to be recorded on physical media, and those drives are prone to failure. Alternatively, perhaps your cluster is growing more slowly than you originally anticipated, and you need to reallocate some drives to be used for more productive purposes.
In either case, you have undoubtedly established policies and procedures to maintain the confidentiality of your data by ensuring that decommissioned hard drives are securely wiped or, if that's no longer possible, destroyed. This has two downsides. First, these policies and procedures must be carried out by people, who make mistakes. Steps may be forgotten, or hard drives that were thought dead misplaced. And in the face of forensic techniques, there may still be recoverable data on these dead disks. Second, this prevents you from being able to return drives that have failed within their warranty period. Since the drive has failed, you can't wipe it, and since you can't wipe it, you can't feel secure in sending it off to a third party. By encrypting the data before it ever lands on a disk, we can solve both these problems. Now, considering Swift's architecture, the natural point at which to do this encryption is in the proxy server. There, we can encrypt all data on ingest before it's ever distributed to the object nodes. And by decrypting data on the way back out, this can all be transparent to the end user. Keep in mind this is an operator feature. Additionally, this provides a natural point at which to integrate with external key management services such as Barbican. This is very similar to how we already integrate with Keystone or Ceilometer. To do the encryption, we use an industry-standard 256-bit AES cipher in counter mode. This allows us to maintain all of Swift's existing API contracts. And because implementing your own cryptography is rarely a good idea, we use the already-vetted cryptography Python library. So for this initial release, we have primarily focused on object data and its metadata. This far and away comprises the majority of disk space used in a cluster. There is user-settable metadata that can be stored alongside the object. And since that may also contain identifying information, we want to make sure that that is encrypted as well.
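A minimal sketch of that cipher choice, using the cryptography library the talk mentions; the key, IV, and plaintext here are purely illustrative:

```python
import os
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(32)  # 256-bit AES key
iv = os.urandom(16)   # initial counter block for CTR mode

plaintext = b"Benvinguts a Barcelona!"
encryptor = Cipher(algorithms.AES(key), modes.CTR(iv), default_backend()).encryptor()
ciphertext = encryptor.update(plaintext) + encryptor.finalize()

# Decryption with the same key and counter recovers the original bytes.
decryptor = Cipher(algorithms.AES(key), modes.CTR(iv), default_backend()).decryptor()
recovered = decryptor.update(ciphertext) + decryptor.finalize()
```

One reason counter mode preserves Swift's API contracts is that the ciphertext is exactly the same length as the plaintext, so things like Content-Length and ranged reads keep working unchanged.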
When encrypting the data, we want to use a unique key for every object. However, storing all of these keys in an external service would be unscalable. Instead, we use a root secret for the entire cluster, and from that derive keys for every object. This derived key is used directly to encrypt the ETag and user metadata, while we generate a separate random 256-bit key for the object data; we then encrypt that key and store it as additional metadata. This is to allow us, as future work, to re-key data without needing to re-encrypt the entire object body, which is many orders of magnitude larger than the metadata. Now, it's important to keep in mind that in this initial implementation, the root secret is stored in config files on each proxy server. This is somewhat problematic, as you now need to treat those drives as being confidential. However, there is work in progress by Matthias at IBM to integrate with Barbican, the OpenStack key management service. With that, all of your Swift drives will be RMA-able. In fact, Alistair will even use this in-progress patch for his demonstration. Thanks, Tim. We're going to show you this feature at work, we hope, in a demo. Before I get into the demo, let me just talk a little bit more about what Tim just referred to there. We're going to be using this work in progress from Matthias, who's actually here (thanks, Matthias), to have our encryption root secret persisted in the Barbican service. Barbican is good at looking after secrets. This means that Swift doesn't need to be concerned with looking after our root secret. Before we can do that, we need to actually get the secret into Barbican and set up some credentials that can be used for the Swift proxy to retrieve it. Before we got up on the stage today, I went through a process such as this. I created a user and a project in Keystone, which I'm going to use to be the holder, the owner, of my root secret.
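The derivation scheme just described can be sketched roughly like this. It's a toy illustration assuming HMAC-SHA256 as the derivation function; the function and variable names are made up and this is not Swift's actual keymaster code:

```python
import hashlib
import hmac
import os

# The single cluster-wide root secret; in Swift this comes from the proxy
# config file or, with the in-progress work, from Barbican.
root_secret = os.urandom(32)

def derive_object_key(path):
    # A per-object key derived deterministically from the root secret and
    # the object's path, so no per-object keys need external storage.
    return hmac.new(root_secret, path.encode(), hashlib.sha256).digest()

object_key = derive_object_key("/AUTH_test/c/o")  # encrypts ETag and user metadata
body_key = os.urandom(32)  # random key for the object body; this key is itself
                           # encrypted with object_key and stored as metadata
```

Storing the body key wrapped, rather than using the derived key directly on the body, is what leaves the door open for future re-keying: only the small wrapped key needs re-encrypting, not the object body.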
If you are familiar with deploying OpenStack, you'll know that services such as Swift, Nova, and so on already have the concept of a service user. They have a user in Keystone that has an admin role on a project, which I think is known as the services project. I'm not using that project to store my root secret, because what I want to avoid is any other service becoming compromised and then, by virtue of it having the admin role on that project, being able to read my encryption root secret. We're using a completely separate project within which we will store the encryption root secret. So I set that up in Keystone, and I present the credentials for that user and project to Keystone, and it gives me back a token, and that token then authenticates me and authorizes me to put an encryption root secret into the Barbican secret store. So I've already done that before I got here today. Now I configure my Swift proxy server with the same credentials, not the root secret but just the credentials that will enable the proxy server to in turn query Barbican and retrieve the root secret dynamically. This is very similar to the way in which the Swift proxy, in fact many services, have credentials that they can use to authenticate with Keystone so that they can validate users' tokens with Keystone as they handle API requests. Very similar to that. So when the proxy server starts, it presents those credentials to Keystone, it gets a token back, just like I did myself when I was configuring this, and it uses that token to authenticate with Barbican and retrieve the encryption root secret, which is only ever held in memory in the Swift proxy server. It's never stored within Swift. One nice thing about this is it means that we can rotate those credentials, just change them from time to time, to guard against perhaps them leaking or getting lost. The secret itself is only ever persisted inside Barbican.
Okay, right, now I actually have to prove to you that this works. So for my demo, I'm actually running all of those services in a virtual machine on my laptop. First of all because that just means it's more likely to actually work here in this possibly disconnected world. But actually that's a really helpful thing to do, because it means that the storage nodes in my Swift cluster are actually writing data to directories of the file system in that virtual machine, which means that I can actually go and find my data at rest and show you that it's been encrypted during the demo. So the first thing I'm actually doing here is, I've just gone to Keystone and presented some user credentials for my client to make requests to the Swift API. And I got back an endpoint for my Swift cluster, and I got back a Keystone token. So throughout this demo, when I make requests to the Swift API, I will be authenticating using that auth token. And that means that I expect to get back my data. I'm authorized to read that data. And as Tim said, this is a feature that is transparent to the client; it's server-side encryption. And because I want to kind of show you a before-and-after effect, initially in the demo I'm going to disable this encryption feature. We're not going to have it installed in my cluster. So allow me to just do that. I actually have my Swift configuration under version control, so all I'm going to be doing is just checking out different versions of my proxy server configuration. So initially this diff is showing you that I'm actually taking out the Swift encryption middleware components from the proxy middleware pipeline. Let me do that now. And I need to restart the proxy so it's configured in this way. So first up I'm going to create a container into which we're going to put an object. I'm using curl. And there you'll see I present my auth token with the curl command, and I'm going to create a container that just has the name c. And thankfully that worked.
So now I'm going to put an object into that container. Again using curl, again using that auth token. My object name will be o, sorry, not very imaginative. The object I'm putting is going to be read from a local file, secret.txt. So this is my data. This is my user's data. (There always seem to be important software updates.) Okay, so that succeeded. And I'm just going to go and get the object back via the API. Okay, so with my auth token, get this object back. There it is. Now for some of us that might already appear to be encrypted, but if you have any familiarity with Spanish, it's saying welcome to Barcelona. So that's the content that we actually want to encrypt. Before we do so, let's take a look at what happened to that object in my file system. As I mentioned earlier, a typical way in which Swift is deployed is to store three replicas of every object, and I'm using that kind of storage policy here. So you see that there's three places in my file system where the cluster has written a copy of that object, in a file with the .data extension. Let's just choose one of those files and take a look inside. And there's my object content. There's the plain text of my object. Okay, so another feature of the Swift API is that you can make a request to the container URL and get back a listing of all the objects that are in the container. So I'm going to do that now. And because I've only put one object in this container, it will be a very short listing. The reason I did that is I want to show you that you don't just get back the name of the object. You actually get back some metadata about the object from the container service. And that metadata, as well as having the name and the size of the object and its modification time, also has this piece of information, which is the hash. It's the MD5 sum of the object content.
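That hash value is straightforward to reproduce; here's a quick sketch of how the ETag in the listing relates to the object's content (the sample content is just a stand-in for the demo's secret.txt):

```python
import hashlib

# For a simple (non-segmented) object, Swift's ETag is the MD5 hex digest
# of the object's content, which is why the container listing's hash can
# leak information about what was stored.
content = b"Benvinguts a Barcelona!\n"  # stand-in for secret.txt
etag = hashlib.md5(content).hexdigest()
```

Anyone who can guess the plaintext can confirm their guess by comparing MD5 sums, which is exactly why this piece of metadata needs encrypting at rest too.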
So this is one of those pieces of metadata that Tim referred to that we want to encrypt, as well as the content of the object. And particularly in this case, because the MD5 sum of the content is actually revealing some information about that content. Now, the problem, well, not the problem, because we fixed this, okay, it's implemented, but the challenge we had is that this metadata is actually stored in container databases, separately from the object. So there's multiple places in the Swift cluster where we've needed to go and ensure that data and metadata are encrypted. Okay, so those container databases, just like objects, are also triple replicated. They're durable, like the objects. So there they are in my file system. I'm going to go find one of them. And I'm going to run an SQL query on that container database and read out the particular row that's describing this object. And that's what I get back. And there you can see that MD5 sum value is stored in this container database. It actually has a different label in the database. It's called the ETag. So Swift uses that MD5 sum as the entity tag value for all of its objects. So next I'm going to enable encryption. And I'm going to come back and show you that not only is the object content encrypted, but also this metadata, which is held separately in the container database, has been encrypted too. Because we don't want that stuff leaking out, even the MD5 sums leaking out of our cluster. Okay, so as I said, I just need to check out a different version of my Swift config. In this case, the diff is showing you that I'm inserting the encryption middleware. I have a keymaster and an encryption piece. And importantly, remember to restart my proxy. So we now have encryption installed. And I'm just going to run through that set of operations again. So first of all, I put the same object into the same container.
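For reference, the pipeline change in the demo looks roughly like this in proxy-server.conf. The middleware names are as in the Newton release, but the pipeline shown here is abbreviated and the secret value is obviously a placeholder:

```ini
[pipeline:main]
# keymaster and encryption sit near the end of the pipeline, just before
# the proxy-server app, so data is encrypted on ingest, decrypted on read
pipeline = catch_errors cache authtoken keystoneauth keymaster encryption proxy-logging proxy-server

[filter:keymaster]
use = egg:swift#keymaster
# In the initial release the root secret lives here in the config file;
# the Barbican keymaster used in the demo retrieves it at startup instead.
encryption_root_secret = CHANGE_ME_base64_root_secret

[filter:encryption]
use = egg:swift#encryption
```

The keymaster derives per-object keys from the root secret, and the encryption middleware does the actual AES-CTR work on bodies and metadata.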
And now, again, I've just gone and found one of those data files that's now storing this object at rest. And let's take a look inside. Okay, and that definitely is ciphertext this time. Okay, that's the encrypted version of my client's data. Importantly, and just emphasizing this again, an authenticated request to the API from my client will still read back the plain text version of the object. This feature is transparent to clients. Thankfully, that works. So there I have the plain text coming back through the API. But what we really care about, of course, is also that piece of metadata that I mentioned in the container databases. So first of all, again, going through the API and listing the container, I get back the plain text metadata. There's the same hash value. But I'm really hoping that that hasn't been stored as plain text inside my cluster. So let's go and find the container database again, and run the same SQL query on it. And now you'll see that in the row in the container database, the record for this object, that ETag value has been encrypted. And there is its ciphertext. There's actually a whole lot of junk also in that row. That's other metadata to do with the encryption process itself. So we have metadata about the metadata written to this database. And that's the end of my demo. I hope you're convinced that we are encrypting this stuff on disk. Tim? So I promised performance numbers. This all sounds great. But you would expect that, because we're doing this added work of encryption, there must be some performance penalty. Before I jump into benchmarks, though, I'd like to give a little overview of the setup that I used. I just borrowed an arbitrary small cluster that we have in a lab set up at SwiftStack. This had four dedicated object servers and a pair of proxy/account/container servers.
Since all of the work is being done in the proxy, it's worth pointing out that these have dedicated internal and external interfaces and four two-year-old Xeon processors, which notably support AES-NI, which is hardware-accelerated encryption. This wasn't actually an intentional choice on my part. This just happened to be what was available. It's five-year-old technology. I suppose it's kind of standard now. On the client side, I just wanted some quick numbers, so I just used a single node with 50 concurrent worker threads to drive some load. Let's look at the baseline. This is the standard performance characteristics for Swift. There are two main things I like to look at for performance in Swift. One is the requests per second, which you can see stays relatively constant for small object sizes, then drops off as the objects get larger. If we switch to average throughput, you can see why: we have saturated our NICs. It might seem curious that puts have such noticeably worse performance than gets, but keep in mind that this is a triple-replica policy. While our external interface still has room to spare, we have saturated the internal interface. If we look at an erasure-coded policy, we see similar general trends. There's noticeably worse performance for small objects, but the get performance for large objects is comparable. On puts, we can actually get better performance, since we're writing less data into the cluster. That's our baseline. What does it look like when we turn on encryption? If you can't tell the difference, neither can I. There are actually four distinct lines there, and we still have the same bottlenecks. For small objects, we are constrained by the connections, parsing HTTP, and for large objects, we end up just saturating our NICs. And looking at erasure-coded policies, we see the same basic trend. There was perhaps some slight degradation, but your bottlenecks are going to be the same. So, Swift object encryption, it's a thing. I love it.
We should totally do that. It's been a year's worth of work from a dozen different contributors, even more when you consider the design feedback, and yeah, it's great. We recommend that you use this for new clusters, primarily. We did design an upgrade path for existing clusters, but in order to ensure that all of your data has been encrypted, we really recommend that you do this on new clusters only. And finally, so long as your hardware supports accelerated encryption, it seems like there is little performance penalty for turning it on. Yeah, Tim mentioned that this has been an effort by a team of people. Swift has a fantastic upstream developer community, and a lot of these people are actually in the room, so well done to them. I just want to acknowledge the input we've had from all of these folks in getting this feature into Newton. And we'd be happy to take any questions. There's a microphone on the side over here. If you could please use the microphone for questions, because the session is being recorded. That would be great. Could you use the microphone, please? Thank you. Sorry, it's a long walk. So you said the encryption is done at the proxy level, so why does it matter to require the hardware AES? So the proxy servers still perform the encryption themselves. That's where having the AES instructions really helps, I think. So you don't need any of that hardware on the object nodes? Correct. Yes. So how do we enable encryption from the client using REST APIs? Is there metadata that we support? Right now, this is a feature that's completely under the control of the operator. It's server side and, as we said, it's transparent to clients. During our development work, we explored offering an API feature for clients to enable and disable encryption, and in fact, also for clients to provide their own keys. And in the talk we did at the last summit in Austin, we actually described that work in more detail.
But we haven't yet merged that work and released it as part of Swift. What we wanted to do was get the minimum usable feature set out so people could start playing with it. And then this work can progress. And I would still maintain that if you as a client want to make sure that your data has been encrypted, it would be best to do that on the client before sending it off to Swift. And my second question was along similar lines. In case a client wants to do it on the wire, before it sends the data on the wire, can we still use Barbican as a KMS? So long as your client knows how to talk to Barbican, I don't see why not. At that point, it's out of Swift's control. And is that in the APIs right now? That's not something that's supported by the Swift API right now. Yeah, I guess clients can talk to Barbican. And I think there's a whole raft of different models that could be explored. For example, your client using a Barbican-hosted secret to encrypt data that it sends to an object store, and then delegating for the object store to decrypt that data by giving the object store credentials. But as I said, basically we wanted to get the first minimum simplest working feature out at this point. And we think we have something that is useful with the set of features that's there. Sounds good. Makes sense. Thanks. Hello, I have two questions about performance. First one, is there any impact on latency or first-byte delay? Did you measure some? I neglected to look at that. I would have to check. Find me afterward. I can make sure to get you that. And you said that there is very little impact if we're using the correct CPU. What is this impact? On the order of a percent or two decrease in throughput. Okay. Thank you. My question is related to whether the operator can choose encryption or decryption at the object level. Certain objects need to be encrypted and others not. Is that something currently supported? That is not. This is a cluster-wide setting.
This allows you to be confident in being able to RMA drives, or that sort of thing. Really, an operator wouldn't know much about what data is stored or how sensitive it should be. My other question is related to this root secret, the central key which is coming from Barbican. At times that key may get compromised, or a certain organization may have a policy to change it periodically. Is that something supported? Not yet. We have intentionally left room for us to be able to implement that in the future. But currently, don't lose it. Okay. Thank you. Is that key rotation? Yes. Just to mention again, one of the things Tim mentioned during the talk, what we have done is we've made sure that all of the actual object content is encrypted using a randomly chosen key that in turn is encrypted using our derived key. So although we don't yet support key rotation, we can do credential rotation but not key rotation, should we in the future do that, we don't have to re-encrypt the content of every object, all of those petabytes of data that we may have stored. All we need to do is actually re-encrypt the randomly chosen key for each object to achieve key rotation. And so we deliberately put that one feature in there with a view to adding key rotation in the future. We probably have time for one more question, if anyone has one. I'm not familiar with the Python crypto library. So does the strength of encryption depend on that, like FIPS compliance and things like that? Like how strong the strength of the keys is? Yeah. So the strength of the keys isn't dependent upon the library exactly, but I would want a vetted implementation, such as the Python cryptography module, to make sure that there wasn't some side-channel attack. And the key message there is that storage engineers have not gone off and implemented a cryptography library here. We've used a cryptography library that's been implemented by cryptography experts.
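The re-keying structure Alistair describes can be sketched like this. It's a simplified illustration using AES-CTR for both the body and the key wrapping; the function and variable names are made up, and real key wrapping would normally use an authenticated construction:

```python
import os
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def aes_ctr(key, iv, data):
    # In CTR mode the same operation encrypts and decrypts.
    ctx = Cipher(algorithms.AES(key), modes.CTR(iv), default_backend()).encryptor()
    return ctx.update(data) + ctx.finalize()

# The large object body is encrypted once with a random body key...
body_key, body_iv = os.urandom(32), os.urandom(16)
object_body = b"petabytes of data, in miniature"
body_ciphertext = aes_ctr(body_key, body_iv, object_body)

# ...and only the small body key is wrapped with the derived per-object key.
old_key, wrap_iv = os.urandom(32), os.urandom(16)
wrapped = aes_ctr(old_key, wrap_iv, body_key)

# Rotation: unwrap with the old key, re-wrap with the new one. The body
# ciphertext on disk is never touched.
new_key, new_iv = os.urandom(32), os.urandom(16)
rewrapped = aes_ctr(new_key, new_iv, aes_ctr(old_key, wrap_iv, wrapped))
```

Re-keying then touches only a few dozen bytes of metadata per object instead of the whole body, which is the property that would make future key rotation tractable at petabyte scale.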
In fact, one of the authors of that library gave us some review time on what we're doing here. So we're confident that, by using a standard library that's written by crypto experts, we're doing the right thing. Do you happen to know about FIPS compliance? I'm not sure. I'll hand over, sorry. John may know. I do know the answer to that question. So the FIPS compliance stuff, for those who don't know, is about the U.S. government standards for data-at-rest encryption. We are using OpenSSL, which can be compiled in a FIPS-compliant mode. So I would say that it's kind of tricky to talk about a FIPS-compliant open source project. But if you're building a product around Swift, you can certainly pursue that, and all of the knobs and switches are available for you to get that sort of compliance. So it's kind of ready, but nobody has gone through that process yet with the U.S. government and taken the time and money to do that. So we're using the external libraries that are available to be FIPS compliant. So like Alistair and Tim were talking about, we're using the third-party modules that have been vetted by crypto experts, using stuff like OpenSSL that can be used for FIPS compliance, but Swift itself is not FIPS certified. And to this point, my knowledge is that there is no product built on Swift that is yet. That's also because it's brand new, and there are other people, including SwiftStack, who are investigating stuff like that. And it is worth noting that 256-bit AES encryption should be sufficient for FIPS compliance. That algorithm is good. The only question is the implementation. Great. Thanks for that, John. Okay, we're out of time. Thank you again for coming along and listening.