Okay, it's just about one o'clock, so I guess I'll get started. As you can probably tell from the title, this talk is about cloud storage. I'm going to be covering an API-level design vulnerability in a few different cloud systems. So, a quick introduction: my name is Zach. I'm a student at the University of Waterloo. Like many of you here, I've had an interest in computer security and applied security for a very long time. This is my second DEF CON, and the first time I'm speaking at DEF CON or any conference bigger than about 20 people. So hopefully I'll get that same response afterwards too. It remains to be seen.

Before this talk I did a little bit of recon, asking some of my friends what they actually use cloud storage for. A lot of them use it as a sort of USB key replacement: they share 10-megabyte-or-larger files with friends, back up their documents, or use it for availability and accessibility across several devices. For the most part, it replaces USB keys, and a lot of them still treat cloud storage systems the same way they treat USB keys: as a large container they throw files into until they run out of space, and then delete a few to free up a bit of room afterwards. But one of the cool things about cloud storage systems is that they have many more features than just providing space. I have a little chart here, I don't know if you can see it, that shows some of the additional mechanisms these providers have, like version history and backup retention. And that's really what we're targeting here.

So the vulnerability, the main idea I want to discuss, is that treating files as blocks filling up a larger box doesn't quite represent cloud storage once you add the time dimension. If we reframe that previous picture as a storage-space-versus-time graph, really a Gantt chart, then each file we add occupies a different time interval, and when we remove a file its lifespan simply ends at that point. With this kind of representation, the amount of space we're using is a sliding bar: at any given instant we occupy a different amount of space. And that gives us an interesting mechanism for recovering previously deleted files.

What we're really talking about is that a lot of these cloud systems enforce a size limit in their quota management but a time duration in their history and backup retention. When you have these two independent quota dimensions, you effectively have unlimited storage, because you can exploit history retention to get additional space. We're limited by our upload bandwidth rather than by the provider's stated quota.

So what this tool does, when uploading a large file, is cut it up into several smaller fragments and upload those fragments as successive versions of some arbitrary new file. Then we top it all off with a chunk of zero size, so the quota accounting mechanism sees it as a zero-size file, even though the whole history sits behind it.
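To make that chop-and-pack step concrete, here's a minimal Python sketch of the idea. This is not the tool's actual code: `backend.upload_version()` is a hypothetical stand-in for whatever overwriting-upload call the provider's API exposes, and the only assumption is the one described above, that every overwrite of the same remote path is retained as another revision.

```python
import hashlib

CHUNK_SIZE = 512 * 1024  # nominal fragment size (512K)

def chop_and_pack(local_path, backend):
    """Upload a large local file as successive revisions of one small remote file.

    `backend.upload_version(remote_path, data)` is a hypothetical stand-in for
    the provider's overwriting-upload call; each overwrite of the same remote
    path is kept by the provider as another revision.
    """
    # Name the remote handle after the file's hash so we can find it again later.
    with open(local_path, 'rb') as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    remote_path = '/' + digest

    manifest = []  # (index, size) of each fragment, kept locally for later checks
    with open(local_path, 'rb') as f:
        index = 0
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            backend.upload_version(remote_path, chunk)  # becomes revision N
            manifest.append((index, len(chunk)))
            index += 1

    # Cap it with a zero-byte write: the *current* version, which is all the
    # quota accounting looks at, now takes up no space.
    backend.upload_version(remote_path, b'')
    return remote_path, manifest
```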
So retrieval is very easy with this process: all we have to do is pull all the versions and glue them back together with cat. Going back to the storage-time graph from earlier, I used a single bar to represent a file, but really we should treat it more like this: a series of versions that together make up the original file, each of which occupies only a small amount of space during its existence. So our account usage is actually close to zero at any given moment. It's a fairly simple idea.

So I rolled it into a tool for you guys. I call it D-Pack Choppa, you know, running with the whole cloud theme. It chops up files, packs them, and de-packs them afterwards. It's a vertical storage management framework: I've built a pluggable storage framework that abstracts away the API implementation specifics of the individual cloud storage services. On top of that, the tool maintains a database backend for the fragmentation, keeping the table of fragments and the original file they were cut from, and it provides a command-line tool for the core functionality of these components.

So I can talk all day up here, but you guys really want to see a demo, right? All right. I'm having a little bit of a resolution problem there. Is that better? Okay. I'm starting here with an empty directory, just showing you that there's nothing up my sleeves. I'm creating a 64-megabyte file that I'm going to upload to the service. Here's the checksum of it, which I'm saving for later. And then let's upload it.

One of the things I'm trying to demonstrate here is that there are ways of circumventing existing detection mechanisms for this kind of thing. What I'm doing, and you can see it here, is drawing the size of each individual fragment from a normal distribution around 512K, plus or minus about 5%, to get around any mechanism in place to detect continual overwrites of the same file. I'll get into this a little later; there's a bunch of different techniques you can use to mask what we're doing, but this demonstrates it fairly well. This output is generated by the D-Pack tool itself: it shows the individual chunks that belong to this file, as well as the file size per upload. I'm going to use it later to compare against the information I get back from the server; this is all locally generated.

We're just about finished. You can see the second-to-last file there is about 200K, and the last one, to top it all off, is zero size. And if I go back into this folder, the file acting as the handle in the storage framework takes up zero space. So, back to where we were: I've deleted that binary and I'm now reconstructing the file from the fragments I'm getting back from the server. The chunk numbers you see here come from the REST API; they give us the mapping back to those individual chunks we were looking at earlier.
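That revision list is all the unpack step needs. Here's a minimal sketch of the retrieval side, under the same caveat as before: `backend.list_versions()` and `backend.download_version()` are hypothetical wrappers around the provider's revision-listing and revision-download endpoints, and I'm assuming revisions come back newest-first.

```python
def unpack(remote_path, backend, out_path):
    """Rebuild the original file by concatenating every stored revision in order.

    `backend.list_versions` and `backend.download_version` are hypothetical
    wrappers around the provider's revision-listing and revision-download
    endpoints; revisions are assumed to be returned newest-first.
    """
    revisions = backend.list_versions(remote_path)
    with open(out_path, 'wb') as out:
        for rev in reversed(revisions):   # oldest fragment first
            data = backend.download_version(remote_path, rev)
            if data:                      # skip the zero-byte cap on top
                out.write(data)
    return out_path
```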
If you compare this list with the one we had earlier, you'll see a one-to-one mapping between the file sizes we're getting back and the file sizes we sent. This example is specific to Dropbox, but there's no reason it can't be extended to other cloud storage providers. So the download has finished, the file exists, and we can see that the checksums match. So we can actually use this for storage.

The tool in the form I used there is available on the CDs you're getting as part of the conference package, and I'll also have the updated version of the code on GitHub at this link; you can bug me for it afterwards. One of the things I like about this toolkit, and one of the reasons I wrote it in Python, is the extensibility it gives us for hiding from detection mechanisms. For example, we can maintain our own deltas that map to realistic changes in file size and file contents, rather than faking it through the API. We can also do a sort of adaptive mangling and use different file names. Right now the tool just uploads with the git hash and uses that as the anchor point in the cloud storage system, but there's no reason we have to.

For future work, I want to extend the CLI. Right now it just supports get and put, but that's fairly simple functionality to build on. I also want to get more modules done. I looked at a couple of other cloud storage providers, just two or three, that have some mechanisms in place to defeat this, but they aren't particularly rigorous, so really only Dropbox works at this stage. But we can work on that, right guys? I also want to add more tunable options, so we can look at different ways of automating how the file fragments are generated. In this case I used a generator producing 512K chunks with a normal distribution, but there's no reason it has to be that. Right now everything overwrites one file, but there's no reason we can't spread across multiple files. There's a whole bunch of directions we can take this, depending on what tunables we want to expose.

This wouldn't be a security talk without the implications of this kind of vulnerability. If we look at the blue team concerns, it's fairly straightforward to detect this by looking at constant-size writes, the timing of the uploads, and the deltas between them. But we can deal with that with generators: introduce subtle variations in the delay between uploading the different versions, and vary the file names and the file sizes, so we can counteract their initial response. Secondly, it's fairly straightforward to ban an API key, but it's just as easy to request a new one, and they're not going to limit the tools available to developers because of one or two bad eggs. Thirdly, the one thing that is a fairly obvious signature is the null caps: those zero-size fragments right at the end that make the files take up no space in the internal metrics. We can replace those with something very small, like a one-byte file. By moving to one byte we no longer have truly unlimited space, but with a two-gigabyte quota we can still store two billion files this way. A sketch of the kind of size and timing generator I'm talking about follows below.
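As one illustration of those tunables, here's a minimal sketch of jittered size and delay generators. The 512K mean and roughly 5% spread are the values used in the demo; the delay numbers are purely illustrative assumptions, not anything the tool ships with.

```python
import random
import time

def fragment_sizes(mean=512 * 1024, rel_sigma=0.05):
    """Yield fragment sizes drawn from a normal distribution (~512K +/- 5%),
    so uploads don't show a constant, easily fingerprinted chunk size."""
    while True:
        size = int(random.gauss(mean, mean * rel_sigma))
        yield max(size, 1)  # guard against an unlucky draw at or below zero

def jittered_sleep(base=2.0, spread=1.5):
    """Wait a randomized interval between version uploads so the write timing
    doesn't look like a script hammering the same file. Values are illustrative."""
    time.sleep(base + random.uniform(0, spread))
```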
One of the reasons this is a major concern to these companies is that having unlimited space really undermines their business model. They have the whole drug-dealer, the-first-bit's-free kind of model, and unlimited storage breaks the financial incentive behind it. On the other hand, if they go the opposite way and break large binary writes, they'll damage a lot of existing tools that already use Dropbox or any other cloud storage system. For example, I run EncFS on top of Dropbox, and that makes a lot of repeated binary modifications that would probably trigger detections very much like the D-Pack tool does. Finally, I know we've discussed this at various talks about PRISM and everything, but deep file analysis is really time-consuming and frowned upon. Really, it's more time-consuming for them than it is a problem for us, so that's something we can use to our advantage.

So I got through everything I wanted to say in about 11 minutes. I just wanted to give some special thanks to the friends who helped me get to this stage and encouraged me to do this. And yeah, that's all I have to say. Enjoy your lunches.

You're still speaking. No, I'm not. Yes, you are. Oh yeah, this is a fun conference, I forgot about that up here. What do we call this? Shot the Noob. Thank you. Why aren't we doing Shot the Noob? First-time speaker. What else do we need? There, right there. Someone's first time at DEF CON. First time at DEF CON, sir. All right, come on up. She was sitting next to him. So is this your girlfriend? Wife? All right, congratulations. All right, here we go. It is very hard to be chosen to speak at DEF CON. Very competitive. So, a big round of applause for our first-time speaker. Now you can say you're done. Okay, I'm done. Thank you.