Okay, so I'm going to cover a slightly different topic, but it's also very relevant, and I'm surprised every day by the number of people who don't have backups themselves. The saying goes that there are two types of people: people who do backups and people who will do backups. As soon as you lose a lot of data, you realize it could have been gone at any moment, and you start doing them. You need to back up your stuff, otherwise it might be lost at any moment. Many, many things can happen. Your drive can fail. Your laptop can be stolen. You can make a mistake writing some script and run rm on the wrong thing. There are many ways you can lose a lot of your stuff without realizing it. So I'm going to go through some recommended guidelines and some pitfalls you can fall into, even if you are doing some sort of backup and think you're covered.

One thing you will see mentioned a lot is the 3-2-1 rule for backups. It's the recommended way of doing it: you need three copies, on two different media, with one of them off-site. What this is saying is that you don't want to put all your eggs in the same basket. Let's say I have a hard drive where I'm making backups. It needs to be a separate drive. If you're just making backups to a separate partition on the same machine and your machine is stolen, you lose both. You need two different devices for this; if it's a server, you need separate servers. And the second thing is that at least one of the copies needs to be off-site.
Because you could still have a separate hard drive that you store in a drawer in your room, and, hopefully not, but if the house burns down, again, you lose everything. That's the kind of thing you don't want to happen. The recommended strategy also says to keep one copy on-site and readily available. That's really convenient because restoring stuff over the internet can be really expensive, both in terms of money and in terms of time. A lot of these cloud services will be fairly slow if you suddenly want to download all your data. So the on-site copy is convenient: something you can quickly access and that is there just in case.

Then, even if you have all of this, there are still common pitfalls. One of the most common ones is that you need to be testing that your backups are working. It's not enough to blindly trust the script that you wrote. Maybe it's outputting all these file names to your terminal and you think, oh, nice, it's backing up. But who knows? Maybe you don't have permissions on the server you're backing up to. The data is being copied, you see things in the output, but when it goes to write to the server, nothing gets stored. And when you go to recover your data, the data is not there. There are many horror stories; I think one of the most famous ones is Toy Story 2. The movie was almost lost for this very same reason. Someone accidentally deleted everything, the entire movie that was sitting on the server. And when they went to the backup guys, they were told, oh, hey, the backup was not properly configured, we don't have your data. It was just a coincidence that some employee had copied the data to work from home, so they could salvage the movie.
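One way to make "test that your backups are working" concrete is to restore a backup into a scratch directory and compare it against the live data. The following is only a minimal sketch of that idea in Python, not something shown in the lecture: the function names `file_digest` and `verify_restore` are hypothetical, and it assumes you have already restored the backup to a local folder with whatever tool you use.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_restore(source: Path, restored: Path) -> list[str]:
    """Return a list of problems: files missing from, or different in, the restored tree."""
    problems = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source)
        dst = restored / rel
        if not dst.is_file():
            problems.append(f"missing: {rel.as_posix()}")
        elif file_digest(src) != file_digest(dst):
            problems.append(f"differs: {rel.as_posix()}")
    return problems
```

An empty list means every file in the source tree came back byte for byte. Files that changed after the backup was taken will also show up as differences, so a check like this is most meaningful right after a backup run.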
So the way you should be doing this is, when you do backups, every so often try recovering data from them to test that they are properly working. That's one of the things you should take into account.

The other thing is that mirroring is not a backup. That's the case if you are just making an identical copy of your files somewhere else, and it's the same reason you will hear people say that RAID, having two disks that mirror each other, is not a backup. This is because if you delete stuff from your own disk, everything is synced, all the changes are propagated, and you don't have any way of making snapshots or versions. There are many ways data can be corrupted. Your disk might still be working, but some sectors might be corrupted and you haven't realized it, and that will be propagated to the backup solution that you have. Or maybe some malicious software replaces a bunch of your files; the files are still there, but the sync just pushes those changes. And that's what a lot of cloud services like Dropbox and Google Drive do. They keep old files around for a while, but their main idea is a mirroring solution. So you cannot trust that in the long term you can go back to an old version, because they only keep stuff for one month or something like that.

So you want to use something with versions. Say at some point in time, T1, you have files A and B, and at some point in time, T2, you have files B and C. You want your backup to have versions: I have A and B, and I have B and C. If I want to go back in time, I can recover A. It's not lost.
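The difference between a mirror and a versioned backup can be shown with a toy model. This is a deliberately simplified, in-memory sketch in Python; the class names `Mirror` and `SnapshotBackup` are made up for illustration, and it naively stores a full copy per snapshot.

```python
class Mirror:
    """Keeps only the latest state, so deletions and corruption propagate."""

    def __init__(self):
        self.files = {}

    def sync(self, current):
        self.files = dict(current)


class SnapshotBackup:
    """Keeps every synced state under a label, so old versions stay recoverable."""

    def __init__(self):
        self.snapshots = {}

    def sync(self, label, current):
        # Naive on purpose: each snapshot stores a full copy of the file table.
        self.snapshots[label] = dict(current)

    def restore(self, label, name):
        return self.snapshots[label][name]


# At T1 the disk holds files A and B; at T2 it holds B and C (A was deleted).
t1 = {"A": b"contents of A", "B": b"contents of B"}
t2 = {"B": b"contents of B", "C": b"contents of C"}

mirror, backup = Mirror(), SnapshotBackup()
mirror.sync(t1)
backup.sync("T1", t1)
mirror.sync(t2)
backup.sync("T2", t2)

recovered = backup.restore("T1", "A")  # still available in the T1 snapshot
a_in_mirror = "A" in mirror.files      # False: the mirror followed the deletion
```

After the second sync the mirror no longer contains A, while the snapshot backup can still hand it back from T1.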
Along with this, one thing you want to be doing is deduplication. This kind of versioning system can become really expensive if you do it naively. If you just make an entire copy of all the files that you care about and transfer it every single time you make a backup, you're going to end up with a lot of redundant information. And again, some of these services might get expensive when you start copying files over and over again. This touches on a thing called a symlink. A symlink is a file that is saying, oh, don't mind me, go somewhere else to figure out where the file really lives. There's a similar thing called a hard link. A hard link works the same way a regular file name does: it is just a pointer to some place on the disk where a bunch of bytes live. So instead of storing everything twice, you can figure out that this B thing is shared between these two versions. What you can have is version one pointing to A and B, and version two pointing to B and C. That way you only store B once. And the nice thing about hard links is that if you delete any of the pointers, all the others still work. It's really convenient.

And the last large point is encryption. Unless you control all the servers where you're storing your backups, think about it. Maybe your code is not sensitive, and your photos are also not sensitive, but there is so much stuff, like your taxes or your social security number, that you might not be comfortable uploading to some third party in the cloud that might be mining your data. But again, with very little effort, there are tools, and we will list some of them later, that can simplify this for you.
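Going back to the hard-link point above, here is a rough sketch of how a snapshot tool could avoid storing B twice. It is an illustration in Python under stated assumptions, not how any particular backup tool is implemented: `take_snapshot` is a hypothetical helper, snapshots are plain directories on one filesystem, and "unchanged" is decided by comparing file contents.

```python
import filecmp
import os
import shutil
from pathlib import Path
from typing import Optional


def take_snapshot(source: Path, dest: Path, previous: Optional[Path] = None) -> None:
    """Copy source into dest, hard-linking files that are unchanged since previous."""
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source)
        out = dest / rel
        out.parent.mkdir(parents=True, exist_ok=True)
        old = previous / rel if previous is not None else None
        if old is not None and old.is_file() and filecmp.cmp(src, old, shallow=False):
            # Same bytes as last time: add a second name for the existing data.
            os.link(old, out)
        else:
            # New or changed file: store a fresh copy.
            shutil.copy2(src, out)
```

With a first snapshot of A and B and a second snapshot of B and C taken with `previous` pointing at the first, both snapshot directories contain a B, but the data is stored once. Deleting the first snapshot's B only removes one name; the second snapshot's B keeps working.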
We can have this exact same setup where my client has some secret key, and instead of storing A, B and C, what I store, in simplified terms, is an encrypted version of A, an encrypted version of B and an encrypted version of C. This way the server can be storing this data, but since everything is encrypted on the client side, they cannot read my backups. They will just have some random-looking information, and without the key they won't be able to read it. And this is compatible with the deduplication scheme, so you can still make these incremental changes over time, and the server will just be seeing random stuff.

Some extra things you may want to look into: you might not want the same backup strategy for all your data. Maybe you care more about your documents, like your passport or your taxes and stuff like that, than about your photos, where you are fine having only one copy instead of three. There is also software that will allow you to make a bootable version of your system, so you can make an entire copy of your disk. So if you lose your laptop, or the disk fails, you can replace it and boot really quickly, because you have an entire copy of your disk.

Any questions so far?

The last thing I wanted to touch upon, and this is something that is becoming more and more common over time, is that there's a high chance that you will be using a bunch of web services and they will be storing some of the data that you care about. I had this realization when I had been using Spotify for a while. At the beginning it doesn't really matter, but over time you build up playlists, and you really care about them.
And if Spotify, tomorrow or the day after, lost the data because they hadn't backed it up properly, all my playlists would be gone, and I would have no way to fix that. So looking at all these web services, for example, here I have Instapaper, because Spotify is trickier to back up, you have to think about the data that is living in all these web services and how to back it up, just in case. For a lot of them, you can go to settings and you have something like export as a CSV file. It just downloads a file to my computer, and I can open it in some text editor, and I have all my data. With all the stuff we saw in data wrangling, you can parse it and you can recover it. So you still have that peace of mind.

Similarly, another thing that you may care about, and this is something that I learned about some time ago that was really good: sometimes you really like some content that is on a website, but it's not really easy to back up a website. You can always save all the files to your computer, but then it's not easy if you want to send it to someone else. So there's this awesome project run by the Internet Archive. It's called the Wayback Machine. It allows you to submit pages, and they will take a snapshot and store it for posterity. They have many petabytes of saved web pages. So, for example, if we type IMDb, you can see all these snapshots that they have taken over time. If we go to 2001 and pick one of these snapshots, it takes a while, because they're not using the fastest processors for this kind of stuff.
But, for example, here we can see what the web page of IMDb looked like in 2001. And sometimes a web page that someone has linked to is not out there anymore, but you can come to the Wayback Machine and maybe someone has already taken a snapshot of it. And the same goes the other way. If you have some web pages with stuff that you link people to often, you may want to back them up. If you go back to the main page, you can type here imdb.com, for example, and it will take a snapshot of the page as it is right now. This works for static web pages; if it's a dynamic web page, it will only save a static snapshot. You can also give it a link to a PDF or a link to an image, and they will save that too. And the same goes for MIT classes. For example, for this class you can see there are a lot of snapshots that people have taken over the years, and you can just go back. It's sometimes convenient: you can see the previous years, and you can see this is what it was in 2012.

And that is most of what I had for backups, so we are almost done. Any questions?

Oh, you're going to do automation, right?

Yeah, I'm going to do that.

All right. And I'm curious, I have a question for the students. How many people here would not lose any data if you lost your computer right now?

Oh, nice. No data, huh? No data? Bold claim. So what do you guys use?

Oh, yeah, I should probably mention the tools. I think we link them on the web page.
So there is a bunch of software. If you want something with all the features, and you don't want to figure out how to set up the deduplication, the encryption, all that stuff, I think Tarsnap is a good choice. You also want some remote, and maybe you don't have servers in the cloud and you don't want to set up your own VPS and secure it. That kind of stuff could be an entire class; just maintaining a server is going to be tricky. So Tarsnap is a really good default solution that you can use. It is good for saving stuff that is not media. You don't want to be backing up big blobs like photos or videos, because the pricing can get much more expensive for that, but for simple files it can be really efficient. It follows the same framework that we have over here: it figures out the data, it deduplicates and figures out what the new things are, then it compresses, then it encrypts with a client-side key, and then it sends that to the server. And if you configure it properly, not even someone who gets ahold of your computer can delete the backups, because you can only append to them instead of overwriting them.

If you want something that you can control on your own, there's an open source tool called Borg Backup that's pretty much the same. The only thing is that you do need to figure out the remote, where you are storing the data. That's the one that I use. I just have a server in my friend's house in Spain, and I back up to that. It allows you to mount all these different snapshots into your file system, navigate them and recover files from there.

And there are these simpler tools where you have to implement the rest yourself, which can be more convenient.
rsync, I think, is just a remote copying tool that figures out, between the remote folder and the local folder, only the things that have changed, and transfers only those. But it won't do the snapshots, for example; you have to do that yourself. And it won't do the encryption, which can also become tricky if you don't implement it correctly. And rclone is a similar thing. rclone is convenient if you want to be backing up to, say, Dropbox or Google Drive. rclone is a command-line utility that works like rsync but lets you configure all the APIs, so you can copy files to and read files from Amazon S3, Dropbox or Google Drive, these remotes that don't allow you to SSH in and copy files over SSH.

Yeah?

So can you just tell me what's wrong with the following setup? I just put all my Git repos inside of Dropbox. Even though people say don't do it, I've always done that, and it's amazing. And then literally everything I ever do on my computer is just in the Dropbox that MIT pays for. And then big files, if they are images, go on Google Photos, and I'm allowing them to do whatever they want; that's my privacy choice. And then all the other data I have is crap that doesn't really change often, so I just have it copied on a couple of disks. And I don't really have to do anything, because that data doesn't really change.

So your private SSH keys, for example, go in Git repositories in Dropbox?

No, those are just saved on...

Saved where?

They're saved in a Git repo.

Which is in Dropbox?

No, they're just saved on, uh, yeah, so I have a key that encrypts all my private key files that are stored on Dropbox.

And where's that key stored?

That's just on a couple of flash drives.

On a couple of flash drives. Yeah. Okay, so that seems like the start of a problem.
Really?

No, I mean, I have my private keys encrypted on flash drives around the place, too. So there are a couple of issues you want to think about. One is that you're basically trusting everyone with your data. If you're willing to trust people with your data, then it is trivial to do backup. There are tools that let you do full disk backups, and then everything is safe, but you're uploading all your files. If you're willing to do that, then backup is very easy. It does mean that you'll back up a lot more than you might care about. For example, I don't include my Git repositories in my backups, because they're already in Git, right? They're already pushed somewhere. So my concerns are more things like doc files. I back up all of my email: I download it to my machine and then include it in my backup, in case my email provider goes away or I stop trusting them for whatever reason. I back up all of my cookies from my browser. In addition to that, I back up various scans of passports and stuff that I don't want to leave in Dropbox. Those I also keep in encrypted backups. So it really comes down to trust, right?

It sounds like this is jointly about security and backup.

Oh, absolutely. If all you want is backup, then that is trivial if you're willing to trust people with your backup.

Although the thing with Dropbox that I see becoming an issue is that Dropbox is always mirroring. If you change a file and you want to recover it later, they only keep the old version around for a while.

So, to take an example, if you do worry about security, imagine that someone steals your laptop. Can they delete all your backups from there? And is that a threat model that you worry about?

No, because Dropbox saves your history.

If I have your laptop, I can delete the entire version history from Dropbox too.

Even the history?

Yeah. I can wipe your Dropbox account. I can delete your Dropbox account.

By going into the browser?

Yeah, sure.

Not with Dropbox.
That's fine. I have your laptop. I can do whatever I want.

That's just a password issue, right?

Okay, where do you store your passwords? Somewhere that's accessible from your laptop?

It sounds to me like this is a security problem.

No, no. Well, I mean, yes, you're right. All of this is about security. If all you care about is backup, it is trivial. All you do is mirror your entire disk to somewhere online. So if all you want is backup, the problem is trivial. That I totally agree with.

The only issue is, yeah, somebody accesses my browser, but that's because I didn't do a good job of making my browser inaccessible.

No, someone runs away with your laptop.

Yeah. Now they have access to anything on my laptop, but I don't see how using any of these programs for backing up is going to prevent them from...

With Tarsnap, there's no way someone can delete your backups, even if they have your laptop. The cryptographic key you keep on your laptop is append-only. It does not let anyone delete.

Dropbox is signed in with a password.

Okay, where do you store the password?

I don't store it.

You just remember your password to Dropbox? That suggests to me that it's not particularly secure.

It's pretty good.

So you're saying you have pretty good passwords for all sites, and they are not reused?

For Dropbox and about three other things I have really good passwords that I just force myself to memorize.

Sure. Yeah, I mean, I agree. If you're willing to remember long passwords securely and you trust Dropbox, that seems fine.

Okay.

So, yeah, all of this comes down to security and trust. I'm not willing to trust anyone with anything.

Yeah, I think that's a fair example.

But that does make my backup strategy also more painful.

Speaking of passwords, where do you store your passwords?

I think we cover that in the security lecture. I think all of us use password managers. I use the, what's it called?
The standard Unix password manager? The one that's called pass?

Pass.

I use 1Password.

Yeah, I use Bitwarden, I think, which is open source and all that stuff. And then I have some stuff in pass. It's stuff that I want scripts on my system to be able to use without reading a plaintext password directly.

And coming back to someone running away with your laptop, one thing about encryption: the same applies to your hardware. Even if you have some password on your computer, if they have hardware access to your computer and you don't have disk encryption, they can easily get into some privileged mode, override the user and read all your files. And nowadays it has become easier and easier. Mac has FileVault, and on Linux, when you're installing, you can use LUKS for encryption. And the hardware support is there. I have been using this for years and I haven't seen any slowdowns because my entire hard drive is encrypted. Otherwise, the OS is doing some authorization, but it's not encrypting your files at all.

We're taking a five-minute break now. I'll keep answering questions, but if anybody wants to take their break, they're welcome to.

I'm wondering, for you two on Mac, do you use Time Machine, and, I guess, do you have thoughts on Time Machine?

I have used Time Machine in the past, and it was so finicky sometimes. When it worked, it was awesome. But sometimes it was extremely finicky, like it would fail because it couldn't find the Time Machine disk anymore.
And as soon as I started running my own server, I realized I had more control with the encrypted backup: someone could not delete the stuff, because they would need access to the server, and they also couldn't read the stuff even if they got my backup disk. So I stopped using Time Machine in favor of Borg. It is convenient, though. I think Time Machine is a good solution, and I think that's in the notes. It's a good solution that does the versioning. Unlike a lot of backup tools that are just mirrors, Time Capsule does give you versions. And that's a really good idea, having versions of your files, because if you delete files they might be lost forever. And that's just losing data, which is kind of the opposite of backups. So it's a good solution, but it still has its own caveats.

Have you used it?

I used it a long time ago, but I've switched to other tools since then. I want my data to be backed up to multiple servers, and I have a bunch of my own servers, some here, some at my house. So, yes, I don't use Time Machine anymore. I think it doesn't let you have multiple backups, right? Or at least it didn't back when I used it. You could have a single Time Capsule, but I couldn't say, I have five Time Capsules, synchronize all of them, or something like that.

Yeah. It's funny, because I do use Time Capsule, but I just run all this stuff myself. I have a proxy server in between and just mirror or mount the disk. You can use Time Capsule that way, and it doesn't know what disk it is. It's pretty good.