So this is the title of the talk I'm giving today. It's a project that I worked on last summer as part of Google Summer of Code. I'm Damesh. I'm an active open source contributor who contributes code to Drupal, and I'm also an organizer at Injak Mentor of Code, a different version of Google Summer of Code that we organize in our college. I'm a developer at GSoC, a mentor in GCI, and an undergraduate sophomore. This is my email ID if you want to reach out to me sometime.

Like I told you, this project was part of Google Summer of Code, and I worked on it for Drupal. So let's first talk about what Drupal is. Drupal is a content management system. So what is a content management system? If you ever need a website up and running and you don't want to write any code, you install a content management system, and with a few clicks you have a website up and running. That's what a CMS is. Drupal is one of the most popular CMSs on the market, and one of the things I love about Drupal is its huge community. It has over a million people, and it feels great to be a part of that amazing community. So who uses Drupal? Most of the government websites in the USA, Twitter, NASA, and a lot of others: 2.3% of the World Wide Web, basically.

Coming to the topic of the talk and moving away from Drupal, I'll give a quick introduction to cryptography, because some of you might not know what it is. Cryptography and encryption used to be synonymous terms in the pre-modern era. Basically, cryptography is taking some text and converting it into gibberish text.
That's cryptography, and it uses complex mathematical functions that may be reversible or irreversible. So what is encryption? Encryption is the process by which you convert sensible data into this kind of gibberish text.

This is how encryption happens. You take some text that makes sense, you take a key, and you put them into an encrypting function. That function performs mathematical operations, for example on elliptic curves, which is what ECC (elliptic-curve cryptography) does, and there are a lot of other algorithms that work behind the scenes. So an encrypting function takes some sensible data, a key, and several other parameters, and turns it into gibberish text that you cannot understand. Once you have encrypted your data, you have some random-looking text. In decryption, you take that random text and another key, let's call it key 2, and you convert it back into text that makes sense.

There are two kinds of encryption: symmetric encryption and asymmetric encryption. In symmetric encryption, the keys used for encrypting and decrypting are the same. In asymmetric crypto, they are different: one is called the private key and the other the public key. The public key can be put out in public, on some server or wherever you like. Anybody who wants to send you encrypted data encrypts it with your public key and then sends it to you. When you receive the data, only you have the private key, so you can decrypt the data and get what the sender intended to send you. That is asymmetric cryptography. These are the two things that fit together like puzzle pieces in the architecture of the module.
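To make the symmetric and asymmetric cases concrete, here is a minimal Python sketch. It is an illustration only, not the code from the module: the symmetric cipher is a toy (XOR with a SHA-256-derived keystream) standing in for a real cipher such as AES, and the asymmetric part is textbook RSA with tiny primes and no padding. All names (`toy_symmetric`, `encrypt_with_public`, `decrypt_with_private`) are made up for this example, and none of it is secure.

```python
import hashlib
import secrets


def toy_symmetric(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a SHA-256-derived keystream.

    The same call encrypts and decrypts. Not secure; it only stands in
    for a real symmetric cipher.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(x ^ s for x, s in zip(data, stream))


# Symmetric: one shared key is used for both encrypting and decrypting.
shared_key = secrets.token_bytes(16)
message = b"some text that makes sense"
gibberish = toy_symmetric(shared_key, message)
recovered = toy_symmetric(shared_key, gibberish)

# Asymmetric: textbook RSA with tiny primes (illustration only).
p, q = 61, 53
n = p * q                          # modulus, part of both keys
e = 17                             # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (needs Python 3.8+)


def encrypt_with_public(m: int) -> int:
    """Anyone who knows the public key (n, e) can encrypt."""
    return pow(m, e, n)


def decrypt_with_private(c: int) -> int:
    """Only the holder of the private exponent d can decrypt."""
    return pow(c, d, n)


secret_number = 65
ciphertext = encrypt_with_public(secret_number)
decrypted = decrypt_with_private(ciphertext)
```

In the symmetric half, `recovered` equals `message` because the same key is applied twice. In the asymmetric half, `decrypted` equals `secret_number` even though encryption only used the public values `n` and `e`.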
So yes, these are the different standards in cryptography. That is actually a meme. I don't know how many of you get it, but that's all right. You get it. You get it. Okay, so there are many standards in cryptography. A lot of encryption standards are already out there, and almost all of the applications we use daily use some kind of encryption. But they have drawbacks, or they are not implemented as well as they should be. Right now, any application that is not zero-knowledge can read your data on the server side.

In the present architecture, the encryption happens on the server, so you need to trust the owner of the server. If you upload some data to XYZ company, you need to trust them not to misuse it, but at times it turns out that they do. You might have heard about the recent leaks and what happened with Facebook. Then there are back doors: at times companies leave back doors in their software so that some organizations, which I shouldn't name, can access the data secretly, and they are basically spying on you. And then there is the possibility of a data leak.

Okay, so what is a zero-knowledge architecture? This is what Wikipedia says the definition of a zero-knowledge proof is. But what I mean by zero knowledge here is a system where the server has no idea, no knowledge, of what the data on the server actually is.
Okay, so the GSoC project is based on this idea of a zero-knowledge, or no-knowledge, system. The basic objective of the module is that the file data cannot be accessed by anybody outside the intended group, be it the server administrator or anybody who has physical access to the actual server. The data is encrypted on the client side, and it remains encrypted in transit and at rest.

To explain the architecture of this module, I'll have to run you through a few keys and a few terms. This goes along with the architecture that ownCloud also uses on its servers for zero-knowledge systems.

The first thing we have is a group key. The files are to be shared only among a group, so nobody outside that group should be able to decrypt the data. So we make a randomly generated group key, generated when the group is created or when the module is installed. It is never stored as clear text on the server; it is stored as access keys, one for every individual user.

When a user logs in for the first time after the module is installed, we generate a public and private key pair. The public key is stored on the server and the private key is stored locally. There are many ways to store it, but currently I'm using the HTML5 local storage API, which stores data in the browser, so that is where the private key is kept. And finally there are access keys.
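As a rough sketch of how these keys relate, the Python below treats an access key as the group key encrypted with one user's public key. It is a model under stated assumptions, not the module's code: the talk's module runs in the browser and uses RSA and AES through JavaScript libraries, while this sketch substitutes textbook RSA over fixed Mersenne primes and a toy keystream cipher so that it runs with the standard library alone. Every name here (`make_keypair`, `make_access_key`, `open_access_key`, `toy_cipher`, the `server` dictionary) is hypothetical, and none of it is secure.

```python
import hashlib
import secrets

E = 65537  # public exponent shared by all toy key pairs


def make_keypair(p: int, q: int):
    """Toy textbook-RSA key pair from two known primes (no padding, not secure)."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # modular inverse, needs Python 3.8+
    return n, (n, d)  # (public key, private key)


def make_access_key(public_n: int, group_key: bytes) -> int:
    """Encrypt the group key with one user's public key."""
    return pow(int.from_bytes(group_key, "big"), E, public_n)


def open_access_key(private, access_key: int) -> bytes:
    """Recover the 16-byte group key with the matching private key."""
    n, d = private
    return pow(access_key, d, n).to_bytes(16, "big")


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (SHA-256 keystream XOR); the same call decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(x ^ s for x, s in zip(data, stream))


# Group creation: a random group key, never sent to the server in clear text.
group_key = secrets.token_bytes(16)

# First login of each user: a key pair is generated on the client side.
# The primes are well-known Mersenne primes, chosen only to keep this runnable.
alice_public, alice_private = make_keypair(2**89 - 1, 2**107 - 1)
bob_public, bob_private = make_keypair(2**61 - 1, 2**127 - 1)

# What the server holds: public keys, per-user access keys, encrypted data.
server = {
    "public_keys": {"alice": alice_public, "bob": bob_public},
    "access_keys": {
        "alice": make_access_key(alice_public, group_key),
        "bob": make_access_key(bob_public, group_key),
    },
    "file": toy_cipher(group_key, b"shared file contents"),
}

# Bob's client: access key + private key -> group key -> readable file.
bob_group_key = open_access_key(bob_private, server["access_keys"]["bob"])
print(toy_cipher(bob_group_key, server["file"]))
```

Note that nothing in `server` is enough to read the file on its own: the group key only exists there in its encrypted, per-user form, and the private keys stay with the users.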
An access key is the encrypted group key for a user, so every user has one access key per group that they are in. To recap the key generation: the public and private keys are generated the first time a user logs in, the group key is the key assigned to a group, and the access keys are per user.

This is how decryption happens in this architecture. When a user wants to decrypt some data, the browser requests that user's access key through a REST API, and it also fetches the private key from the browser's storage. It decrypts the access key using the private key to get the group key, and it uses this group key to decrypt the actual data that the user wants to see. That is how decryption happens. Encryption fetches the key using the same mechanism, but at the end it encrypts the data with that same group key. That is AES encryption, so it's a symmetric one that I'm using.

Okay, so that was the basic encryption and decryption in the module. But this module goes way beyond that, because there are a lot of edge cases where you don't know what would happen. What if a user isn't there yet? What if the module is already installed and a new user is added to a group?

As I said before, I'm using HTML5 local storage for the module. Many people have told me that local storage is volatile: if you clear the browser's data, the key goes away. So how did I solve this? When the key pair is generated, a copy of the private key is downloaded to your local computer. At any point in the future, if your computer gets damaged or you move to a new computer or a new browser, there is an interface where you can restore the keys.

What happens with a new user? So the complete architecture is set up and all the keys are in place. What happens if the admin adds a new user to a group? First of all, when the new user logs in for the first time, they get a private and public key, so those keys are in place. When the admin adds him or her to the group, the user is added to what's called the pending keys table, which acts as a request: okay, I need this key; if somebody is online, please give me the key. When another user who already has the key for that group comes online, he decrypts and fetches the group key for that group and sees that this user needs the key. He fetches the new user's public key from the table, encrypts the group key with it, and sends it back. So the new user now has an access key and can access the data.

What happens when a user is deleted? There's a hook that is called, and the user's keys are removed from the table. It isn't done currently, but all the keys should be regenerated, because what if the deleted user has a copy of the keys? That hasn't been worked on much yet.

These are the dependencies. When I started off with this project, my mentor told me about the Web Cryptography API, which is a W3C recommendation. I looked it up, and it turned out that the functions I needed for my project were not that stable and weren't working properly back then. After that I explored more and found a few more libraries, but they were also based on this API. It turned out to work somehow. I'm using SJCL, the Stanford JavaScript Crypto Library. It's good for ECC, elliptic-curve cryptography, and AES, but it does not support RSA. I'm not supporting ECC yet, I'm on RSA, so for RSA I needed to add a dependency on JSEncrypt, and I'm using that.

There are a few trade-offs when we go for this kind of architecture. All of this happens in the browser, on the client side, and the browser is not an ideal place to do a lot of processing, so it is way slower than it would ideally be on the server. You cannot put a lot of load on it, because it might crash if you have very large files. And then there's the volatile key storage that I talked about. If you want more details on the architecture and the progress, there are these links, and the slides link is here as well.

This is a demo of how it works, just a few basic clicks. Do you want to see the demo? Okay. I'm not sure if I have to wait. No, I think I have it. So this is a user logging in, which generates a new private and public key pair. This is the private key that gets downloaded. And then, what is this? Okay, he logs out and logs in as a user that already has a key. Now I'm going to post some content to a group, and I want that file to be readable only by the users of that group. So the file is encrypted and uploaded, and now I can access the file over here, but if I log out, it's still there. Now I'm logging out. Come on. Okay, now I've logged out, and it's the same URL, so the file is on there. And yeah, that's it. Okay, so that's it, thank you.
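For reference, the pending keys flow described in the talk for adding a new user to a group can be sketched in Python roughly as follows. This is a model under assumptions, not the module's implementation: the real module runs in the browser and talks to Drupal over a REST API, while here the server tables and each user's browser storage are plain dictionaries, and textbook RSA over fixed Mersenne primes (no padding, not secure) stands in for the real public-key encryption. All function and variable names are hypothetical.

```python
import secrets

E = 65537  # public exponent shared by all toy key pairs


def make_keypair(p: int, q: int):
    """Toy textbook-RSA key pair from two known primes (illustration only)."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # modular inverse, needs Python 3.8+
    return n, (n, d)  # (public key, private key)


def wrap(public_n: int, group_key: bytes) -> int:
    """Encrypt the group key with a user's public key, giving an access key."""
    return pow(int.from_bytes(group_key, "big"), E, public_n)


def unwrap(private, access_key: int) -> bytes:
    """Decrypt an access key with the private key to get the group key back."""
    n, d = private
    return pow(access_key, d, n).to_bytes(16, "big")


# Server-side tables (stand-ins for database tables).
public_keys = {}    # user -> public key
access_keys = {}    # (user, group) -> encrypted group key
pending_keys = []   # (user, group) pairs still waiting for an access key

# Client side: stand-in for each user's browser local storage.
local_storage = {}  # user -> private key


def first_login(user: str, p: int, q: int) -> None:
    """On first login a key pair is generated; only the public key is uploaded."""
    public_keys[user], local_storage[user] = make_keypair(p, q)


def admin_adds_to_group(user: str, group: str) -> None:
    """The new member has no access key yet, so a request is queued."""
    pending_keys.append((user, group))


def member_comes_online(member: str, group: str) -> None:
    """An existing member serves the pending requests for the group."""
    group_key = unwrap(local_storage[member], access_keys[(member, group)])
    for user, pending_group in list(pending_keys):
        if pending_group == group:
            access_keys[(user, group)] = wrap(public_keys[user], group_key)
            pending_keys.remove((user, pending_group))


# Alice creates the group and holds the first access key.
first_login("alice", 2**89 - 1, 2**107 - 1)
group_key = secrets.token_bytes(16)
access_keys[("alice", "team")] = wrap(public_keys["alice"], group_key)

# Bob logs in for the first time, is added by the admin, and waits.
first_login("bob", 2**61 - 1, 2**127 - 1)
admin_adds_to_group("bob", "team")

# Alice comes online and re-encrypts the group key for Bob.
member_comes_online("alice", "team")
bob_group_key = unwrap(local_storage["bob"], access_keys[("bob", "team")])
```

After the last step the pending list is empty and `bob_group_key` matches `group_key`, even though the server only ever saw encrypted copies of it. The deletion case from the talk would additionally need a fresh group key and new access keys for the remaining members, which the talk notes is not done yet.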