So this is going to be about encrypted cloud storage, as you might guess from the title, and specifically about an independent security review we did of the SpiderOak ONE application and what we found. The basic goal is to see what they claim to provide and what they actually provide in terms of security.

Here is what I'm going to be talking about. I'll start by describing the threat model we used for this review, then I'll describe at a very high level how the SpiderOak application works, and then I'll show some attacks that break its security. As a small disclaimer: we did report all our issues, and if you're using SpiderOak then you should just use the newest version. I guess that holds for all software.

All right, I'll start by trying to motivate encrypted cloud storage. That might seem silly at a conference full of cryptographers, but I'll try anyway. Traditional cloud storage has some privacy concerns. For example, besides ourselves, the user, who else can read our files? If our files are stored in plaintext on the server, then anyone with access to the server can essentially read them, and that's probably a problem. There's also the issue of what happens when we delete files: are they actually securely deleted? The same holds when we close our account: are the files we had at some point totally gone from the server, or are there still some leftovers? That's maybe not so cool. And finally, what happens if the cloud storage company gets sold? It might be that we are fine with the company as it is now, but if it's suddenly sold to somebody else who has a different approach to managing user data, then we might not be too happy about this. But essentially we have no real control over the files that we
store in this service, so we cannot really control what happens to them after it's sold. The solution, I guess, is obvious: we just encrypt everything on the client before we send it to the server, and hopefully all the server sees is encrypted files, and it shouldn't be able to use those for anything.

All right, but this raises the point that encrypted cloud storage should give us something more in terms of security than regular cloud storage. In particular, I think we want our files to stay secure even if the server turns malicious. All of the points I had before can be considered as some kind of malicious server: the server starts looking at our files, or poking around in them, and so on. It also seems that real products agree with this sentiment, that we do not necessarily have to trust the server. I've taken all of these quotes from the various websites, and they all seem to be saying the same thing: you should not have to trust the server, your files stay secure no matter what, and so on. It's the bottom one we'll be focusing on in this talk.

Okay, the question is of course whether this malicious server threat model is actually used. In the SpiderOak case, after we disclosed these issues to SpiderOak, they wrote this blog post basically stating that it was built in a time before they considered this threat model, and therefore it is not secure against a malicious server. That's too bad, and I guess it spoils the result that we actually did find some issues, if it's not secure per their own claim.

Finally, there has been some previous work looking at these encrypted cloud storage services and the kind of security they provide. I've taken two examples here that consider SpiderOak in particular. The top one
considers an external adversary in the form of some kind of web adversary, and the bottom one considers only file sharing and shows that the server can actually read the files. So when you share a file with somebody else in this application, you lose this guarantee of security. I'll get back to file sharing, because it's a bit worse than this.

All right, our threat model is pretty simple. We assume an honest client, for I think obvious reasons; otherwise it's not so fun to look at. Since we are talking in the context of a real application whose security we want to assess, this simply means that we receive the software from the server before the server turns evil. That's kind of necessary, because once we start looking at what happens when the server turns evil, the obvious objection is that the server could just ship a broken client. We don't consider that, because then security would be hard to analyze.

Informally, our threat model looks at two questions. First, are we secure against a passive adversary? If all the server does is look at the files we are storing, but otherwise does not do anything out of the ordinary, are we then secure? I think it should be pretty clear that this should at least hold, because otherwise there's no point in doing all of this, and we're no better than a regular cloud storage solution. The other thing we looked at is what happens if the server starts deviating in some sense. These cloud storage applications are in essence a client-server protocol, so there's a lot of communication between the client and server. The question is what the server could do that it would not normally do, and how the client reacts to this. So specifically, the
protocols that it has, and can these be misused? And again, since we're considering a real application, there's also the question of the client's implementation.

Finally, we do provide this in a more formal way in our paper, which is on ePrint. It's basically just an indistinguishability experiment between a client and a server. I think most of you have seen these kinds of definitions possibly a hundred times. Our definition only considers confidentiality, so it's really the very minimum, I think, that we would want from one of these products.

Okay, SpiderOak ONE: some very basic facts about this application. It's an encrypted cloud storage application, and it has received some praise or endorsements from both Edward Snowden and the EFF, so if you care about these authorities in popular privacy, then this is a nice attribute to have. It uses this interesting name, "No Knowledge", for its encryption routines, and before that it used "Zero Knowledge". It's important to note here that this zero knowledge has nothing to do with cryptographic zero knowledge; it's more like saying "we have zero knowledge of what you're storing, because we're encrypting the data." That's also what led us to look at SpiderOak in the first place, because we thought it was very interesting that they used this name without really considering that it actually has a proper definition.

It supports all the major operating systems, nothing fancy there, and there's some partial support for Android and iOS. It supports file sharing. It's written in Python, so it's easy to decompile, which was nice. It uses certificate pinning and TLS, which means that the attacks we found cannot be run by somebody outside, because an outside attacker would have to break TLS to run them. Of course that's easy for the server, because the server
is the intended recipient of the client's data. We did our review on version 6.1.5, which was the newest release last year when we did this.

All right, communication very roughly looks like this. It proceeds in two parts. There's an authentication part, which is only run on install, so when you install the application the first time, either as part of a new account or on a new computer. The server gets to pick which protocol to run. For some reason there are four protocols that the client supports, but only two of them are actually run. The important thing here is that the server chooses this, so a question I'll come back to is what happens if we run one of the protocols that is not normally run. All of the protocols are non-standard, basically homemade, so this is also interesting. Whether or not that's a good idea is of course up for debate.

Everything else is done using RPC calls in a Python library, and this is surprisingly comprehensive for an application that basically just uploads and downloads data. There are around 90 different procedures that the server can call on the client, which is of course also very interesting to analyze.

Encryption in the application is done like this. If you store a file f, you derive a key by hashing f together with this mk, which is a random string. These file keys kf are encrypted with a per-directory key. So you have a unique key per file, and to protect all of these file keys you use a key per directory. All of these directory keys are then encrypted with some long-term key. A long-term key is just a key that is created when you create the account, and then it stays the same forever. And finally these long-term keys are encrypted
by a key derived from the user's password. So everything in the end depends on the user's password for security, and then it gradually fans out as you go down this tree on the right.

A small note here about password changes. If you wanted to get some sort of future secrecy in case one of these keys is broken, then you'd have to rotate it at some point when you change your password. Unfortunately this does not happen. If you change the password in the application, only the encryptions of these long-term keys are recomputed, but the keys themselves are not changed. So if an attacker knows these long-term keys, then changing the password does not give you any future secrecy.

All right, this is basically what we found. We found four different issues that we could use to attack the client. What this means is that we found some issues in the client that the server can trigger in some way, and that degrade security in some definable or measurable way. The first one weakens the security of a hash derived from the user's password. One of these authentication protocols I talked about before basically sends some parameters to the user and asks him to derive a hash. If the server is malicious, it can choose these parameters in such a way that this degrades security. There are also some interesting consequences of using a pretty out-of-date Python library here, but I'll skip this part. Then we found two attacks that can recover the user's password, and one of them works without the user knowing what's going on at all. This is of course pretty bad, I think. And then we found one attack which in some situations will recover files that are not actually shared. This last one means that in some situations where you share a specific directory, you
would actually end up revealing files that are not part of the directory to the server as well, which is interesting. To get back to the active/passive distinction we have in our threat model: the two password recoveries are active, so they require the server to do something it shouldn't do. The passive one is the file recovery: all the server has to do is inspect the data that is sent. And of course we implemented all of these and verified that they actually work.

So, the first password recovery. Remember that I mentioned we had these four protocols but we only saw two of them. One of them looks like this. The server sends a list of RSA public keys and a random challenge string chl. The client uses something called RFC 1751 to compute a fingerprint: it hashes all of these keys first, and then it uses this RFC to compute a fingerprint. This is an RFC from, I think, '94 that describes a way to turn a string of bits into English words.

Then it displays this fingerprint to the user, so the user gets involved in some non-trivial way, besides just inputting the password. Supposing the user accepts, the client computes this layered encryption of the user's password and the challenge, which looks interesting. The issue is pretty obvious here: the server can be the one that picks these keys in the first place, and this lets it decrypt the user's password when the client sends it back.

This leaves the question of the fingerprint, though. The way it's used here, it should be pretty obvious that it's some kind of out-of-band authentication for this list of keys. I think this is used in their enterprise product for escrowing keys, but it can also be used in the single-user product, which is interesting. And the question is of course what the user should be comparing this fingerprint to. So this
is the question, right? If he has never seen this fingerprint before, which I think is a valid assumption if we run this against a client that is not supposed to run this protocol, then what should he do? SpiderOak takes a trust-on-first-use (TOFU) approach to this, and this is the message that is displayed together with the fingerprint. It basically says that if you have not been given this fingerprint, then you should just click yes and move on. Of course this doesn't work in the case of a malicious server, so this authentication does nothing in the active case, I would argue.

Right, the file recovery. It requires some quick observations about how directory sharing happens in this application. Remember we had this hierarchy of keys. The most efficient way to share a directory when you have this structure is to just reveal the directory key, and that's what SpiderOak does. By revealing one symmetric key you can share gigabytes of data, so this is very efficient. Unfortunately we found that when you move files between directories, no encryptions are updated, and we also found that these directory keys are never updated.

This leads to two scenarios that I think are very plausible. First, suppose we have some directory with some files. This could easily be a directory with vacation photos, some of which are maybe too private to share with your colleagues. So you move them to a different directory and then you share the old one. Now you can show these photos, but from this observation, the file that you moved away is still protected by the old directory key. So when you reveal this key to the server, as you do when you share the directory, the server can actually recover the keys of these files. A
similar thing happens as a consequence of observation three. If you have some shared directory, you stop sharing it, and then you add new things to it, then these new things are encrypted with a key that the server still knows, and the server can then recover them. I think it's important to point out that in both these scenarios, the files that the server can recover are files that the user takes specific steps to avoid sharing. So it's not just random files; it's files that the user doesn't want to share.

So this is the last password recovery, the second one, the silent one. I think this is bad enough in itself. After installation, in order to avoid having the user input his password every time he starts up the application, the client writes out the user's password in plaintext to a local file. I guess this is already a problem if you are afraid of somebody stealing your computer or something like this. We analyzed these RPC methods, and we found an RPC method that, given a file path, will return the file's content if the file path matches a regular expression. I guess you can see where this is going: the path of the file containing the user's password matches this regular expression. So the server can just ask for the user's password, and the client will happily return the file with the password in it.

Okay, the original title of this talk was something about lessons, so I'll give mine, because I think all of these issues reveal some interesting and, I think, well-known problems with developing software, anti-patterns if you will. First of all, I think complexity plays a big role in this application: all of these RPC methods, different authentication protocols, and so on and so forth. In fact, for the last password recovery attack we found, the one where you
could just ask for it, the fix was to just remove these methods, because they were not used. It was dead code, basically.

There's also this interesting issue of using the same secret for both authentication and encryption. All the password recovery attacks were enabled by using the password as an authentication secret. If you had separate secrets for authentication and encryption, then breaking these authentication protocols would not be that big of a deal, because presumably all you would get would be encrypted data.

There's some moral about making assumptions about what the user should do or how he should act in different situations. In particular, I think the client should avoid making assumptions about the user, and I think that holds for most software. The context where the user has to accept this strange fingerprint might make sense in the correct situation where this is run, and it might make the application very insecure in the other one. If you care about security against a server that can behave as it wants to, then this is something you have to consider, and you cannot assume that the user will know what to do in every situation.

So this was what I talked about. Finally, I think the main takeaway here should be that the reason these cloud storage services do encryption on the client side is that they want to provide something more in terms of security. I think it's sad that at least some of them don't have a threat model that captures this, and I think it makes the statements they make somewhat moot, somewhat empty. Yep, that's it.

Thank you, Anders. So we have a minute or two for some
questions. Questions? Upstairs?

Well, I have one. Was this all done manually, by decompiling the Python and looking carefully? No automated support to look at how 90 different RPC calls interact with each other?

Yeah, this is manual work. It's a bit sad, because manual work doesn't generalize very well.

As a SpiderOak user, I appreciate it, so thank you.

Okay, nice. You should just update to the newest version, I guess.

All right, so let's thank Anders again.
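
The key hierarchy and the directory-sharing issue described in the talk can be illustrated with a small toy model. This is only a sketch under stated assumptions, not SpiderOak's code: the names `ToyAccount`, `toy_wrap`, `password_key` and `server_recover` are hypothetical, the "encryption" is a toy XOR pad rather than a real cipher, and the KDF and its parameters are made up, since the talk does not specify them. It reproduces only the behaviour reported in the talk: moving a file re-encrypts nothing, and changing the password only re-wraps the long-term key.

```python
import hashlib
import hmac
import os


def toy_wrap(wrapping_key: bytes, label: bytes, key: bytes) -> bytes:
    """Toy 'encryption' of a 32-byte key: XOR with an HMAC-derived pad.

    Illustration only, NOT a secure key wrap. XOR makes it its own inverse,
    so the same function also unwraps.
    """
    pad = hmac.new(wrapping_key, label, hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(key, pad))


def password_key(password: str, salt: bytes) -> bytes:
    # Assumed KDF and iteration count; the talk does not give the real ones.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 1000)


class ToyAccount:
    def __init__(self, password: str):
        self.salt = os.urandom(16)
        self.mk = os.urandom(32)             # random string hashed with the file
        self.long_term_key = os.urandom(32)  # created once, never rotated
        self.wrapped_long_term = toy_wrap(
            password_key(password, self.salt), b"long-term", self.long_term_key
        )
        self.dir_keys = {}           # directory -> directory key (client side)
        self.wrapped_dir_keys = {}   # directory -> key wrapped under long-term key
        self.wrapped_file_keys = {}  # file -> (wrapping directory, wrapped file key)
        self.location = {}           # file -> directory it currently lives in

    def add_dir(self, name: str) -> None:
        self.dir_keys[name] = os.urandom(32)
        self.wrapped_dir_keys[name] = toy_wrap(
            self.long_term_key, name.encode(), self.dir_keys[name]
        )

    def add_file(self, directory: str, name: str, content: bytes) -> bytes:
        file_key = hashlib.sha256(self.mk + content).digest()
        wrapped = toy_wrap(self.dir_keys[directory], name.encode(), file_key)
        self.wrapped_file_keys[name] = (directory, wrapped)
        self.location[name] = directory
        return file_key

    def move_file(self, name: str, new_directory: str) -> None:
        # As reported in the talk: moving a file updates no encryptions.
        self.location[name] = new_directory

    def change_password(self, new_password: str) -> None:
        # Only the wrapping of the long-term key is recomputed.
        self.wrapped_long_term = toy_wrap(
            password_key(new_password, self.salt), b"long-term", self.long_term_key
        )

    def share(self, directory: str) -> bytes:
        # Sharing a directory reveals its directory key to the server.
        return self.dir_keys[directory]


def server_recover(account: ToyAccount, directory: str, dir_key: bytes) -> dict:
    """File keys a server holding dir_key can unwrap from the stored blobs."""
    return {
        name: toy_wrap(dir_key, name.encode(), wrapped)
        for name, (wrapping_dir, wrapped) in account.wrapped_file_keys.items()
        if wrapping_dir == directory
    }
```

In this model, creating `vacation` and `private` directories, adding `secret.jpg` to `vacation`, moving it to `private` and then sharing `vacation` still lets `server_recover` return the file key of `secret.jpg`, which mirrors the first scenario in the talk. Likewise, `change_password` changes `wrapped_long_term` but leaves `long_term_key` untouched, so anyone who already knows that key is unaffected by the password change.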