Hello there, and welcome to How to Back Up Linux-Based Computers for Beginners on YouTube. My name is Daniel Rosehill. I put together this series of presentations as a YouTube video, and it's also going up on udemy.com shortly as a free course for anybody interested in the world of Linux backups.
There are a few videos on YouTube about backing up Linux, and what I wanted to do in this particular video was to give a presentation. It's going to be a series of slides like you're seeing here, and I'm going to give an overview for people who are relatively new to the world of Linux: Linux rookies or, as Ubuntu's slogan goes, "Linux for human beings." I consider myself a Linux human being. I'm by no means a Gentoo user or super advanced, even though I've been using Linux for quite a while. But I do use Linux every day, and I have been doing that for more than 10 years, so Linux is really the very core of my computer experience and second nature to me.
I've developed a backup strategy primarily through necessity. One of the frustrating things that I found with Linux over the years was that Linux can be quite buggy, especially relative to Windows. I think that's kind of a well-known thing. Obviously it's an open source operating system, and there isn't the same kind of development and QA resources going into Linux, even Ubuntu. Linux has certainly taught me a tremendous amount about computers, and as a freelance technology writer, that's knowledge that I use almost every day.
So I'm very grateful to Linux and to open source, which is partially why I'm open sourcing this presentation, my GitHub documentation, and in general all the stuff that I'm writing. I'm just documenting the backup routine that I've been using for three years already, and I've actually been doing Linux backups for longer than that. But I think it's really important for people who are interested in Linux to get this knowledge.
Linux has been great for me, as I said, but the real pain was that, as a somewhat unstable operating system, periodically it would just break down. As I said, not being a super Linux user, there were times where, after debugging on a bunch of forums, Ask Ubuntu and Reddit, I just ran into impossible problems with the package manager. I started using Linux when I was 19, I'm now 31, and as I've grown older I've had less and less time to devote to issues with Linux. I just basically needed to work. And that's kind of what I love about it: when it does work, it's a super stable system.
I'm using a lightweight system here, Ubuntu and LXDE, and when it works, it just works for years at a time if you stick to the LTS upgrades. But it's still really important to have a backup strategy.
It was about three years ago, and I had a succession of reinstalls which were ultimately due to a hardware issue. The interesting thing is that the better you get at Linux and at keeping your system in good running order, by which I mean certain best practices like minimizing your use of third-party PPA repositories and sticking to those LTS upgrades, the more annoying it's going to be when your system does break down and you need to install from scratch. In backup terms, your RPO is going to be getting continuously longer, as measured in the time your system has been running. The point I'm trying to make is that you will be making configuration changes and adding programs over time, those will accrue, and when your system finally bites the dust and you have to install it from scratch, it's going to be a massive.
So I've been through this process, and the last time it happened I said: I'm either going to start using Windows, or I'm going to develop an excellent backup approach for Ubuntu Linux. I'm not saying that my backup approach is the best one out there. Certainly not. It's actually quite a basic backup approach, but that's why I'm sharing it, because, as I said, Linux for human beings. That's where my interest in backups comes from, and that's why I spent a few months recently, here and there, documenting stuff on GitHub and Medium, just to share this info with anybody else who loves Linux and gains from Linux but for whom instability is a big issue. And of course there are OS-agnostic reasons why a backup routine is important: whether you're on Windows or Mac, you also need backups.
So that's the subtitle I have here: how to keep your Linux desktop safe from disk failure, human error and upgrade disasters. I intentionally included both hardware and software malfunction. Disk failure is clearly a hardware malfunction, and that's something that can happen to any Linux user. Human error and upgrade disasters are more Linux-y: if you're using Ubuntu and getting those periodic upgrades, that's a vulnerable time for most people's systems, and if something does go wrong, you'll be very happy to have watched this video, or another video or documentation, and got a good backup strategy running for yourself.
So who am I? This is my headshot here, which needs an upgrade. My name is Daniel Rosehill. To the right is my GitHub repository, and if you're interested in Linux and backups, please feel free to follow me at Daniel Rosehill JLM. That's all together.
I'll be putting it in text here. As you can see, I have a few repositories where I've attempted to commit to Markdown what I'm doing regarding backups, for other people. As I said, I've been using Linux for more than 10 years. I consider myself a reformed serial Linux reinstaller. I've mostly been using Ubuntu during this time. I use Ubuntu with LXDE, which is a very lightweight desktop environment. I used to use Lubuntu before they gave up on LXDE and went over to the dark side of LXQt. I've also used Debian and Mint, I've used Fedora a little bit, and I played around with Gentoo and Arch, but frankly I'm too stupid for those operating systems, so I just stick with Ubuntu for the most part day to day.
I'm not a full-time backup person, or even a full-time tech person, by any stretch of the imagination. In fact, I am a humble technology writer, and that's my website. This isn't a plug for my writing business. It's just to say: take everything in this presentation with a grain of salt. It's for Linux beginners, I consider myself a relative Linux beginner, and I'm just an average Linux user, essentially.
So why would you want to back up? We've covered hardware failure. Natural disasters: floods, fire, etc. This can happen to anybody, whatever operating system you're using. Data protection. What I want to mention here, just at the start, before we get into the various types of backups and the different Linux programs, is that the first thing to do when you're beginning to think about what type of backup plan you need to protect your data is to look at how you compute. That's unique to each person. On an average Linux system you have your user directory, and that's pre-populated with these default folders that Linux shoves in: photos, videos, documents, and so on. Then you have the rest of the system: the system files from the root of the directory, which contain packages and basically the rest of the operating system.
I personally don't really do much in the user directory. I've been in the habit of using the cloud wherever possible for as long as I've been using Linux. It has just made sense to me to do stuff in Google Docs rather than LibreOffice wherever possible. I dream of the day when there are no more operating systems, or everybody has some really lightweight front end. I think it's called thin computing, this idea that's been thrown about and never really took off, but that's always been my aspiration. I really don't have much in my user directory: basically there are a few config files, and everything else is in the cloud. Take this screencast: I'll be putting it up on YouTube and just deleting the file, and I'll back up from YouTube to my backup system. I'm not even going to keep this video recording on my computer, because I get everything up to the cloud as quickly as possible.
So that's the way I compute, and therefore, for me, protecting just the user directory, which is what a lot of people recommend, wouldn't make much sense at all, because there's really not much there to protect. But if you do most things locally, if you're the complete opposite of me and you hate the cloud, or don't trust the cloud, or whatever your reasons might be, then you are going to have everything in your user directory, and you will certainly want to back that up. Or you may not keep files locally at all, and you may not be interested in backing up anything but the system files. It's just worth pointing out that most config files for stuff like LibreOffice and OpenBox
are actually nested within the user directory. But there are some things in the Linux system configuration files that are not there, so if you really want to get everything, you should look at scooping up the system, or most of the system, which we'll talk about.
So why back up? Some more reasons: accidental deletion, human and programmatic, via API integrations. In other words, just as a Linux upgrade that doesn't work with your system can end up really damaging your package manager or some other integral component, equally you could do what I do and have one-click delete enabled, accidentally delete your entire home folder in one click, and not be able to recover it. So accidental human deletion is always a reason as well.
Linux-specific reasons: really, the upgrades that brick the operating system, and that's particularly the case if you're using the bleeding-edge non-LTS releases of Ubuntu, which are less stable. Driver incompatibility has been one for me: if I've changed out the Nvidia driver for a more recent one, that has caused problems, or I've tried to switch back. So these are all reasons.
Prevention is better than cure, so if you can avoid getting into a situation in which you need backups, that's better than even having a good backup strategy. So: protection against accidental deletion. This is the same slide twice over.
So for the Udemy video I'll have to fix that. And that really covers it anyway in terms of reasons to back up your computer: you have hardware failure, natural disasters, software issues, and human error. Those would be the main categories.
So what's inside a Linux file system that you might need to back up? You can just run an ls command on the root of your file system to see what you've got in there. But the important thing to say is that if you are backing up the whole system using something like rsync, you don't actually need to capture every single file, folder and symlink. It's recommended specifically to exclude the following folders. First is /dev, which contains device files. /proc is a virtual file system containing kernel and process files. sysfs is also a virtual file system; that's actually just /sys. /tmp is, as the name suggests, temporary files. Then there are application state files. /mnt is for mounting file systems, and /media is where, if you do something like stick a USB drive into your Linux system, it'll be mounted at a mount point. lost+found contains recovered corrupted files. So you don't need those things, and you actually shouldn't get them, because including them in a restore has the potential to screw things up. You will find different lists and different suggestions for what should and shouldn't be included in a full-system Linux backup, so by all means do your own research, but these are generally the ones that it's a good idea to exclude.
In terms of what I'm going to be going through in this, as I said, open source presentation: a high-level overview of Linux backup solutions for beginner users. I'm going to be going through the common Ubuntu backup tools, GUIs and CLIs, and looking at cloud backup storage.
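Going back to the list of folders to leave out of a full-system backup, here is a minimal Python sketch of how that exclude list could be turned into an rsync command. It only builds and prints the command; nothing is run. The destination path /mnt/backup/ and the function name are made up for illustration, and /run is my assumption for where the "application state files" live.

```python
import shlex

# Folders the talk suggests leaving out of a full-system backup.
# "/run" is an assumption for the "application state files" entry.
EXCLUDED_DIRS = ["/dev", "/proc", "/sys", "/tmp", "/run",
                 "/mnt", "/media", "/lost+found"]


def build_rsync_command(destination, source="/"):
    """Return an rsync argument list. Nothing is executed here."""
    # -a archive mode, -A keep ACLs, -X keep extended attributes.
    command = ["rsync", "-aAX", "--delete"]
    for directory in EXCLUDED_DIRS:
        # "/dev/*" skips the contents but still creates the empty folder.
        command.append(f"--exclude={directory}/*")
    command += [source, destination]
    return command


if __name__ == "__main__":
    # Hypothetical destination; adjust to wherever your backup drive is mounted.
    print(shlex.join(build_rsync_command("/mnt/backup/")))
```

Printing the command first, rather than running it, is a cheap way to check the exclude list before letting rsync loose on the whole system.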
I'll also be giving an overview of some main backup concepts: backing up to local destinations; very importantly, the 3-2-1 backup rule; and devising and implementing your personal backup approach. I'll give you my GitHub again, Daniel Rosehill JLM. I can also be found on Medium, just search for my name, and on LinkedIn.
Now I want to cover some backup fundamentals to help you protect your system. So what is a backup? A definition from Techopedia: backup refers to the process of making copies of data or data files to use in the event the original data or data files are lost or destroyed. Secondarily, a backup may refer to making copies for historical purposes, such as for longitudinal studies (that's more like archiving, I'd call it), statistics or historical records, or to meet the requirements of a data retention policy.
So you've actually got two separate use cases. The first one is the classic one we think of when we think about backup: prevention of the loss or destruction of data. That's really a risk mitigation strategy, if you think about it. The second one I've called data retention and compliance: you might need to retain and archive your data to adhere to some compliance standards. This probably doesn't apply to individual users, but it does if you're a company.
There are three main backup methodologies, and I'm going to explain what each of these means. You have full backups, incremental backups and differential backups. If you need a memory aid to remember the three types, think about backups as a way to ensure data fidelity, and you get it here in capitals, FID: full, incremental and differential.
So let's look at the first backup methodology. I'm providing this information because the main point that I'm going to try to get across in this video is that yes, there are a ton of different backup tools for Linux, but when you break them down
according to whether they are full, incremental, differential or deduplicating, and whether they go to local or remote destinations, if you just use these evaluation criteria, it becomes a lot easier to understand the differences. You'll also see that there's a lot of overlap. This is a feature, or perhaps you could argue a bug, of open source: people come up with their own projects that copy the next one, so we end up with quite a big list of backup tools for Linux. But ultimately, if you understand these concepts, you'll have a good grasp of what the difference between any two tools would be.
A full backup is really easy to understand. Say you have a computer here on the left, and it's got three image files: one, two and three. Let's say PNGs, JPEGs, bitmaps, it doesn't matter. If you want to back up the folder containing those three images, you could simply copy and paste that folder from the computer onto a target. It doesn't matter whether we're talking about a second internal drive or a NAS; it could be anything. That's not what's important. What you've done is just copied the entire folder. So you've taken a full backup of the folder: images one, two and three have been synced over to one, two and three on the target.
That sounds nice and simple, but there is a problem with full backups. Let's say we added a fourth image to the folder, called 4.png. Clearly the smart thing to do in this instance would be to simply copy 4.png from the source, which is our computer over here, to the target, right? There's no point in copying one, two and three, because they're already on the target. But that's not how a full backup works. Full backups copy the full file system from source to target each time they run. That's what they do.
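The four-image example can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular backup tool works: the function names are made up, and it uses throwaway temporary folders as the "computer" and the "target".

```python
import shutil
import tempfile
from pathlib import Path


def full_backup(source, target):
    """Copy every file from source to target, changed or not."""
    copied = []
    for path in sorted(source.rglob("*")):
        if path.is_file():
            destination = target / path.relative_to(source)
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, destination)  # overwrites any existing copy
            copied.append(path.name)
    return copied


def demo():
    """Run two full backups, adding 4.png in between."""
    with tempfile.TemporaryDirectory() as workdir:
        source = Path(workdir, "source")
        target = Path(workdir, "target")
        source.mkdir()
        for name in ("1.png", "2.png", "3.png"):
            (source / name).write_bytes(b"image data")
        first_run = full_backup(source, target)
        (source / "4.png").write_bytes(b"new image")
        second_run = full_backup(source, target)
        return first_run, second_run


if __name__ == "__main__":
    first_run, second_run = demo()
    print("first run copied: ", first_run)
    print("second run copied:", second_run)
```

The second run copies all four files even though only 4.png is new, which is exactly the overhead being described.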
So in this instance, we would be recopying three image files, 1.png, 2.png and 3.png, unnecessarily when we run the backup again. We would describe that as having an inefficient, or high, data transfer overhead.
In terms of copying on our local network, that might not seem like such a big deal. Intuitively it sounds kind of stupid, right? We have three files, and we're literally just going to be overwriting three files on the target, transferring three chunks of data across an Ethernet cable for no reason. But it doesn't really make much of a difference if you think about it. It doesn't really cost us anything to transfer a few kilobytes of information. But when we're talking about cloud computing and cloud storage, where we might have ingress fees or are probably being charged for read/write operations, you can see where full backups begin to make less sense or become more problematic.
This is particularly the case, by the way, if we're backing up something big. In this case of three little images you might think, well, does it matter? It's going to take a few milliseconds to copy these images again. But this is just a small example. When we're talking about backing up an entire operating system, or an entire bank of dozens of operating systems, we could be talking about terabytes and petabytes, and at that scale it really does become quite wasteful.
A differential backup, on the other hand, is kind of the next best thing, if you want to think about it like that. It syncs the changes between source and target since the last time a full backup was run, okay?
So a differential runs a full backup the first time it runs, and then every time it runs after that it copies over the changes. If you run rsync, the first thing rsync does is this long, hanging process where it's looking at what's on the left here, on the source, and what's on the right here, at the target, and it's basically saying: okay guys, what's new? What's been moved? What's been added? What's been deleted? I'm not advanced enough to understand the technicalities of the algorithms and the checksums, how this process works on a granular and deep level, but that's basically what's going on: it's looking at what's different. When you have a differential backup, every time it runs it's going to copy over the changes since the full backup.
That's actually a good thing, because if you have to do a restore, you only need the full backup and the differential. If you put one and two together, between the full and the differential you have enough data to get back to the point the differential was taken, the restore point. So the advantage is that you have fewer dependencies. An incremental, which we're going to see next, means you have a chain, and with a differential you don't need a whole chain. The problem with a chain is that if you have one bad incremental, say a part of the disk that's corrupted where that incremental is stored, you could run into problems. The disadvantage is that, relative to an incremental backup, each differential backup is going to be heavier. That's just intuitive.
An incremental only moves over the changes since the last incremental was run. Hence you get a chain of incremental backups going back to an initial full backup. The advantage of this is, let's think about it
in terms of slices. Every time the backup program or script runs, the slice is going to be small. If you run an incremental backup daily, which is what a lot of people do, and let's take rsync as a very simple example, running from your computer over to a local server or a NAS, you might only be moving a very small amount of data. Maybe you created a couple of folders in My Pictures, maybe you installed one program, but it could just be a few megabytes. But you do get this chain. The disadvantage versus differential is that all the incrementals between the full backup and your desired restore point, at least in theory, need to be intact in order for the restore to perform as it should.
Then there is deduplication. The difference between incremental and deduplication is complicated and, again, slightly at the edge of my knowledge. It only backs up blocks missing since the original full sync. Deduplication is also good for creating various versions which you can roll back to.
So, in terms of backup approaches, we've gone through the three types of backup, and there's one final thing to know: the incremental concept can be taken even further. Say we have a few file changes on the source. As we've seen, for the backup to be as lightweight and efficient as possible from a data transfer standpoint, we're only going to be syncing the changes since the last time it ran. Well, why not take that concept one step further and just sync the bits and bytes, the data that changed within the file? We're talking here about block-level backup: algorithms that only transfer the data that has changed between runs. This layer is beneath the file system.
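To make the dependency difference between differential and incremental restores concrete, here is a small Python sketch. It is a toy model with made-up labels, assuming a backup history that is either all differentials or all incrementals after a full backup; real tools track this for you.

```python
def restore_chain(backups, restore_index):
    """Return the labels of the backups needed to restore one point.

    backups is a list of (label, kind) tuples, oldest first, where kind
    is "full", "differential" or "incremental".
    """
    label, kind = backups[restore_index]

    # Walk back to the most recent full backup at or before the restore point.
    full_index = restore_index
    while full_index >= 0 and backups[full_index][1] != "full":
        full_index -= 1
    if full_index < 0:
        raise ValueError("no full backup to restore from")

    if kind == "full":
        return [label]
    if kind == "differential":
        # Only two pieces: the full backup plus this one differential.
        return [backups[full_index][0], label]
    # Incremental: every link in the chain since the full backup.
    return [name for name, _ in backups[full_index:restore_index + 1]]


if __name__ == "__main__":
    differential_week = [("sun", "full"), ("mon", "differential"),
                         ("tue", "differential"), ("wed", "differential")]
    incremental_week = [("sun", "full"), ("mon", "incremental"),
                        ("tue", "incremental"), ("wed", "incremental")]
    print(restore_chain(differential_week, 3))
    print(restore_chain(incremental_week, 3))
```

Restoring Wednesday needs two pieces with differentials and four with incrementals, so one damaged incremental in the middle of the week breaks every restore point after it.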
So you have data, and that information is aggregated into files, then you have folders, and in Linux you have symbolic links, and all that together amounts to a file system. But data is at the core of that; it's underlying all of it. So when you do block-level backup, you're just moving over the data, and that takes efficiency to the next degree. You can run rsync like this. It's got a delta algorithm, and you will just be scooping up the changes in the data. Again, I'm not a backup expert. The nuances of how the rsync delta algorithm transfers at the block level are very deep, and there are people who are familiar with that level of detail and with when it will be able to sync deltas and when it won't, but suffice to say that's the idea here.
So that's the difference between file-level backup, transferring files, and block-level backup. Block-level backup is faster, more reliable, and commonly used in disk imaging. We'll talk about Clonezilla later. Clonezilla backs up in blocks, and then it constructs an image out of those blocks; it can also reconstruct the file system. rsync can run block-level syncing if you want it to.
Okay, now we have the 3-2-1 rule of backups, the essential rule of backups. Backup best practice calls for backups to be replicated twice. This is where I think the 3-2-1 rule is a little bit confusing, because the three at the start of the rule comes from your operating system plus two different copies: two plus one is three, right? One copy should be on-site and the other off-site. Now the two, which I didn't write out here, means that the two copies should be on different storage media. So let's take a look at my screen here, starting next to Tux over here.
We have a clip-art picture of a hard drive, over here a cloud, and over here a vault. People actually do store their backups in bank vaults as their off-site location; it still goes on, believe it or not.
We don't want to have the primary, the backup source, and any copies on the same disk, because if the disk fails we lose the backup too. So firstly, we want the primary and backup copy one on two different storage media. Then, to take that concept further, one of the copies is going to be on-site, meaning where we are physically located, and one is going to be off-site. This always confuses me, because technically, if you think about it, they have to be on different storage media just because one is on-site and one is off-site. But in any event, the best practice is to have all three copies on different storage media. And the one in the 3-2-1 rule is that one of those should be off-site. It doesn't actually have to be the cloud; it just has to be off-site. This is where a bit of subjectivity enters, and you could argue about how far from your home is good enough to count as off-site for something like a natural disaster, right?
So I would say realistically I try to take a common-sense approach here. If your home is destroyed in a flood, I think you're probably going to have bigger problems than data protection, if we're just talking about your computer here. Presumably your cloud data is okay. If it's just your computer, you have bigger fish to fry, in my opinion. But it's nice anyway to have an off-site copy just in case that does happen, goodness forbid, so that you're able to go over to your friend's house and pull out your backup tape, or go out to your car, and restore.
Then you could really take it to the next level and say: well, what if your whole neighborhood is decimated in a tornado? In the Jewish Talmud there's an expression, al achat kama v'kama, which means "all the more so." In other words, what I say to that is: if you have bigger fish to fry when your house is flooded, then if your house is flooded and the neighborhood or city has been wiped out in a tornado, you have much, much bigger fish to fry than restoring your computer. So what I would say is: be practical. People have different perceptions about how far of a radius is good enough. You can also take more copies than the 3-2-1 rule asks for. You can go with a 4-3-2 rule or a 5-4-3; you can go crazy if you want to. The off-site copy exists to protect against certain disasters, right? If your house gets flooded, this is really what it's there for.
It is of course conceivable, but astronomically unlikely, that you would have simultaneous disk failure. Say you had your primary computer and your backup on your NAS, and the NAS and all the disks failed at once, RAID was useless to you, your computer failed too, and that all happened in the same five-second interval. That is possible but very unlikely. What you could more credibly have, and this is a reason why I actually store my on-site backup cold, is some kind of massive electrical event in your home in which both the NAS and your computer were fried by an electrical overcurrent, in which case again you'd be out of a backup.
That's why the on-site copy, at the very least, should be on another physical drive from the primary device. But if you have a desktop computer and an electrical overcurrent comes in, gets through your surge protector and just destroys it, another drive in the same machine will not be enough protection. If you have the copy on a different physical computing device, such as a NAS or a dedicated server, the server might survive where the computer won't. And if everything in the house gets damaged by power, then you're down to your off-site copy. So that's really what that is.
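The 3-2-1 rule as described here (three copies counting the primary, two different storage media, one copy off-site) can be written down as a simple check. This Python sketch is only an illustration; the class name, field names and media labels are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    name: str
    medium: str     # for example "internal-ssd", "nas", "cloud"
    offsite: bool   # True if the copy lives away from your home


def satisfies_3_2_1(copies):
    """Check: at least 3 copies, on at least 2 media, at least 1 off-site."""
    media = {copy.medium for copy in copies}
    offsite_count = sum(1 for copy in copies if copy.offsite)
    return len(copies) >= 3 and len(media) >= 2 and offsite_count >= 1


if __name__ == "__main__":
    good_plan = [
        DataCopy("primary", "internal-ssd", offsite=False),
        DataCopy("on-site backup", "nas", offsite=False),
        DataCopy("off-site backup", "cloud", offsite=True),
    ]
    risky_plan = [
        DataCopy("primary", "internal-ssd", offsite=False),
        DataCopy("second partition", "internal-ssd", offsite=False),
    ]
    print(satisfies_3_2_1(good_plan), satisfies_3_2_1(risky_plan))
```

The risky plan fails on every count: only two copies, both on the same disk, and nothing off-site, so a single disk failure or house fire takes out everything.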
RPO and RTO are two important concepts in backup. RTO stands for recovery time objective and RPO stands for recovery point objective, and I'm going to use TechTarget here for the two definitions. Recovery time objective is the maximum tolerable length of time that a computer, system, network or application can be down. Emphasis on the word maximum. You set these yourself: it's maximum according to what you deem to be maximum, or what your business use case deems to be maximum. In the banking industry, for example, the RTO might be an hour; it could be five minutes, in fact. So there is naturally a big difference from consumer backup, and I'm recording this screencast for Linux users protecting their own computers. I don't know remotely enough about backups to venture into suggesting what a bank should use. It's just to explain that the type of situation in which RTO and RPO would be very stringent is that kind of enterprise data protection, with compliance-mandated recovery times. But they do matter here too, because they're a very easy way to compare backup approaches: what's the RTO, and what's the RPO?
RPO is the age of files that must be recovered from backup storage for normal operations to resume. I'm going to explain what this actually means in simple terms. The RTO is basically how long it takes for you to get your system back. Yes, strictly it's the maximum time that you allow; in practice it's how long the restore takes, as I'll explain in the next slide. Let's actually jump straight to it. You're supposed to calculate these first. What's the maximum amount of data I would be okay with losing? That's your RPO. What's the longest I can be without a computer? That's your RTO. Now go find a backup strategy that meets that RPO and RTO.
That's how you're supposed to do it. Or, perhaps more credibly in the consumer environment, you can look for something that roughly fits these objectives and what you can afford to do, because if you push these RPOs and RTOs down towards the minimum, you're going to be increasing expense and really increasing complexity.
The point that I'm trying to make here is that you could run an incremental backup every five minutes if you wanted to, to create a very short RPO for your first recovery point, but that would have an overhead on your system. It would probably slow it down slightly to have rsync running in the background as often as every five minutes. So you really have to think about what is okay.
I'll tell you what I do, practically speaking. I have an rsync front end called Timeshift, which I highly recommend. It's an amazing backup program. I have that creating daily, weekly and monthly snapshots for me, so my most frequent snapshot is a daily, and that's okay. That means my RPO for that approach is a day. The RTO is based on how long it actually takes to roll back the system, and in my experience that's about five minutes. So the RTO is very low: if something goes wrong, I can get my system back really quickly. And for me the RPO is fine in most cases.
I don't do anything crazy on my computer in a day. I might install a couple of programs on a busy day, or change some settings, but I can live with losing that. It's nothing compared to the instances I described previously, in which you have Ubuntu running for two years, you upgrade a bunch of stuff, install your programs, customize things, and then one day it bricks. You have no backup approach, so in that case your RPO is going to be two years. One day versus two years: take your choice. In this presentation we're focusing on protecting a typical desktop or laptop running the Linux operating system. A practical example: RPO can be quantified as days of data lost. When you see comparison tables for backup approaches, they're typically quantifying RPO in days of data, so five minutes, or even hourly, would have to be expressed as a fraction of a day. You can just use days for the two of them, RPO and RTO. So, to give a practical working example: we use Borg to back up our desktop, and we run the job once a night at midnight. We know from our test restore that restoring a backup point takes about 10 minutes. The RPO in that case is one day and the RTO is 10 minutes; it takes 10 minutes to get the system back. We're running that job every day, so think about the maximum, and again the focus is on the word maximum. We take the backup, and 23 hours and 30 minutes later, just before we're about to take the next backup, we have some catastrophic system failure. What we would need to do is restore from the last backup. The maximum time that can elapse between two backups is the RPO, so that gives us an RPO of up to one day. In fact, that's probably a better way to describe it, right?
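To make the Borg example above concrete, here is a minimal Python sketch of the arithmetic. It is only an illustration: the function names, the dates and the one-hour RTO target are assumptions made for the example, and nothing here is part of Borg or any other backup tool.

```python
from datetime import datetime, timedelta


def data_lost(last_backup: datetime, failure: datetime) -> timedelta:
    """Everything changed since the last backup is lost when you restore."""
    return failure - last_backup


def meets_objectives(backup_interval: timedelta, restore_time: timedelta,
                     rpo_target: timedelta, rto_target: timedelta) -> bool:
    """Worst-case data loss equals the backup interval; compare both to the targets."""
    return backup_interval <= rpo_target and restore_time <= rto_target


# Nightly job at midnight, failure at 23:30 the same day (illustrative dates).
last_backup = datetime(2024, 1, 1, 0, 0)
failure = datetime(2024, 1, 1, 23, 30)
print("Data lost:", data_lost(last_backup, failure))

# Targets are assumptions: tolerate up to one day of loss and one hour of downtime.
ok = meets_objectives(
    backup_interval=timedelta(days=1),
    restore_time=timedelta(minutes=10),  # measured in a test restore
    rpo_target=timedelta(days=1),
    rto_target=timedelta(hours=1),
)
print("Meets RPO/RTO targets:", ok)
```

The same check with a backup interval of two years (no backups at all, in effect) fails against a one-day RPO target, which is the "one day versus two years" comparison in numbers.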
The RTO is more like what the restore actually takes, while for the RPO you can put the words "up to" before it. Okay, now that we have some backup fundamentals under our belt, let's take a look at the various Linux backup tools. This is just a very quick whistle-stop tour of what's available in backup land, as I call it. In backup land on Linux you have command-line interfaces and you have graphical user interfaces: CLIs and GUIs. I'm going to go through this a bit quickly. Probably the first thing you did, before even thinking about backup on Linux, was google "Linux backup tools", and more than likely you came across three or four articles called something like "20 best Linux backup tools" or "50 best Linux backup tools", and you said, oh my goodness, there are so many different backup tools. That's why I call this backup land, because it is confusing. But once you break down what the tools do according to the information I've discussed here, it becomes a lot less confusing. When it comes to backups there are many ways to skin a cat, but the ultimate thing you're trying to do is comply with the 3-2-1 objective. That's just the best practice, and it's very hard to go wrong with it. Yes, it's nice to have encryption. If you're backing up off-site or to the cloud, it's probably a good idea to encrypt your backups. Whether you need to encrypt your local backups is up to you. My opinion?
I'll give it quickly: if you're not encrypting your disk, and you can, using full disk encryption, then to my mind it doesn't really make sense to encrypt your on-site backup. If you were burgled, if you're worried about physical access and data theft, somebody could just pop the SSD out of your computer and rob your data; an encrypted backup wouldn't help. So those are the kinds of decisions you have to make. Do you need compression? Do you want deduplication? Do you want incremental? There really are many ways to go about it, and the tools all do this in slightly different ways. It's one of the downsides of open source, a bug of the system so to speak, that we have such a wild proliferation of different backup tools. There are a ton of great backup tools and there's a lot to know about backups. I'm still learning; as I said, this is a beginner tutorial for other beginners. Sorry, I want to roll back on that word tutorial. I'm just going to call this a how-to, because it's really peer-to-peer, or that's how I intended it. But you can keep things simple and compare and contrast options if you evaluate them according to a few simple questions. Is it incremental, differential or full? That's why I've gone through this information. What's running under the hood? Is the backup compressed? Is the backup going to be encrypted? Another question worth asking is whether the backup tool can be run on a live system. Clonezilla, for instance, has to be run from a live USB. Most other tools you can just run from aboard the system, which is nice.
It's luxurious. You can be using your computer while a scheduled, automated backup runs in the background through a GUI or through a cron job, and that just takes care of everything automatically. With Clonezilla, if you want to image the disks, you need to stop using the computer, put in your live USB and go through that process, and your computer is tied up; it can't be used while that's going on. Then you should also ask: is this actually going to contribute to my backup strategy, or will it create needless duplication? You want to be smart about it. Realistically, I don't think you need more than one on-site and one off-site backup methodology. It's just not necessary. So a good evaluation is to run through the questions. Do I have my one on-site? Yes. Is the RPO acceptable? Yes. Do I need one snapshot or multiple snapshots? Do I have the off-site? Yes. Is it encrypted? Yes or no. Does it go to cloud storage, and do I have a plan for it? Yes or no. Okay, then you're done. That's it. If you're interested in backups you can keep delving into this whole world, but in terms of getting a basic system set up, you only really need those pieces of the puzzle. So, the CLIs, and again this is a quick overview. rsync is an extremely powerful and versatile backup CLI with a ton of parameters. It's running under a lot of different backup front ends and GUIs, and you can use rsync for full, incremental and differential backups. You might be saying, Daniel, you've made a mistake, rsync is an incremental tool. That's true, but you can also run rsync one time only and create one backup point, if you just run rsync once from source to target.
In that case you're just going to have one backup, essentially. You could also create a few different backup folders on the target and run rsync into each folder, and those will run incrementally if you run them multiple times, but not if you just run them once at one point in time. So basically you can use rsync for all of these, and that's why it's used in a bunch of tools: you can do just about anything with it, using its algorithm, which is a delta-syncing algorithm. It's not deduplication. There is a difference between rsync and dedupe, and some people will tell you they prefer deduplication, but the difference between the two is quite technical, so I'm not going to get into it. Suffice it to say that rsync is really versatile. It can work at the block level, but it doesn't support encryption natively. You can rsync into anything with an rsync server, so you can have rsync running on a web server in the cloud, on a VPS. A Synology, which is network-attached storage, a NAS, doesn't require anything extra.
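As a rough illustration of running rsync into separate folders on the target, here is a small Python sketch that only builds the command lines and prints them; it does not run anything. It uses rsync's real `-a` and `--link-dest` options, which is one common way to get incremental snapshot folders, though not necessarily what Timeshift or any other front end does internally. The paths and the helper name `build_rsync_command` are hypothetical.

```python
import shlex
from typing import List, Optional


def build_rsync_command(source: str, new_snapshot: str,
                        previous_snapshot: Optional[str] = None) -> List[str]:
    """Build (but do not run) an rsync command for one snapshot folder."""
    cmd = ["rsync", "-a"]  # -a: archive mode (recursive, keeps permissions and times)
    if previous_snapshot is not None:
        # --link-dest: files unchanged since the previous snapshot are
        # hard-linked to it instead of being copied again.
        cmd.append(f"--link-dest={previous_snapshot}")
    cmd += [source, new_snapshot]
    return cmd


# Hypothetical paths: a home folder backed up to a mounted backup drive.
first = build_rsync_command("/home/user/", "/mnt/backup/2024-01-01")
second = build_rsync_command("/home/user/", "/mnt/backup/2024-01-02",
                             previous_snapshot="/mnt/backup/2024-01-01")

print(shlex.join(first))   # first run: effectively a full backup
print(shlex.join(second))  # later run: only changed files take up new space
```

Absolute paths are used on purpose, because rsync interprets a relative `--link-dest` path relative to the destination directory.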
On a Synology you can just enable rsync, and you can rsync over SSH on the local area network and just move stuff across. But what lands on the target won't be encrypted at rest. Grsync is a simple GUI for rsync, and rsync is actually what's running under the hood in Timeshift, or you can use Btrfs. Now, rsync's brilliance, if you take away the whole encryption thing, is that it can be compressed and it's very quick and very efficient if you're using it the way it's intended, which is running it multiple times and just moving over the delta at the file or block level. The downfall of it is that, besides SSH, it's not really set up to be interoperable with a bunch of different hosting systems, and that's where rclone, another brilliant project, comes in. rclone works at the file or object level. It's just a file transfer tool; unlike rsync, it doesn't support block-level delta syncing. Its intended purpose is syncing between local and remote. The important thing to say here is that rclone always has to be running on a machine somewhere. As far as I'm aware, and I'm pretty certain about this, you can't have Google Drive and Dropbox exchange data by themselves; the data has to come down through some machine where rclone is running to make the magic happen. So there's always a local machine in the middle. It also supports cloud object storage and backup-specific hosting: Wasabi, S3, B2. So that's rsync and rclone. Next we have Borg. Borg has a bit of a cult following, a lot of fans. I've played around with it but never really been an active Borg user. Borg is deduplicating, and it supports encryption as well as compression. Again, you can use it over SSH.
You can get Borg repositories up to the cloud: you can use Borg and rclone in tandem and create nice backup repositories. That's one model. As I said, they can be used in conjunction, and the GUI for Borg on Ubuntu is called Vorta. Other notable Linux backup CLIs would be Duplicity, which is deduplicating, as I mentioned; Restic, which is slightly more up to date and is cross-platform; and you have Duplicati, which actually shouldn't be on this list because it's a GUI slash front end. I use Duplicity to sync to cloud remotes. Okay, so moving on to the GUIs, the graphical user interfaces. Again, this is just a partial selection of what you have on the market, and I say market in inverted commas because most of these are not paid tools. But they are out there. Timeshift is the first one that I want to give a big shout-out to. It's brilliant. I'll talk about my own backup strategy before wrapping this video up. Timeshift is basically a bit like Back In Time, and it's been so long since I used GNOME, or Windows really, that I can't remember whether Back In Time is Windows or GNOME. But anyway, Mac and Windows have their equivalents; it's a snapshot tool. Snapshots are suitable for quick and easy system restore. The difference between a snapshot and a backup is that a snapshot doesn't actually copy and duplicate the files, but rather notes changes to the files.
So that's snapshot versus backup. I'm a little bit confused whenever I read that, because to me rsync is a backup tool, but Timeshift describes itself in its own documentation as a snapshot tool; to me that would mean, by extension, that rsync is for snapshots. In any event it doesn't really matter. It's a very fine difference, to me at least, and I think to you if you're an average Linux user, because it can be used for just the same purpose. As it says here, this is the only thing I've actually had to use. If you want the very short version of this video, on one leg, in 10 seconds, it's this: get an additional hard drive for your computer, or an SSD, but a hard drive probably makes more sense for backup storage if you're running a desktop; put Timeshift on that, and there you go. You now have a backup system that probably gets you out of 95 percent of problems. That's what I actually do, and that's basically all I've had to use to restore my system. It's incremental; as I said, it's rsync. The first time, it creates a full backup, and then it creates smaller snapshots, and those snapshots are just incrementals. They basically say, this is what's changed since the last time I ran. Now, an important thing to say here, if you think about it logically. See the schema on the right, where we have a screenshot of my Timeshift, and we have W, M, W, D: that's weekly, monthly, weekly, daily. Ignore one of those weeklies; let's just imagine I only have three: D, W and M. Those are going to be three snapshot folders in Timeshift. You can navigate into Timeshift's directory and check these out, and you'll see there are three folders there, and each folder has what basically looks like your operating system. Each one of those is going to be up to the weight of your whole system.
You can configure Timeshift, which is rsync under the hood, to back up just the user data, or just the system, or you can say do absolutely everything, back up my whole computer. Now, if your operating system is taking up 80 gigabytes, each of those folders is going to be up to 80, and each time it runs it's just going to incrementally change those three folders. That's the size of each snapshot you choose to retain, so keep it in mind when you're planning this out. If I have a 500 gig hard drive and I'm using 100 gigs, I basically want to make sure I have at least 300 gigs available if I want to keep three snapshots. And it wouldn't be a bad idea to double that. It's not really the capacity of your drive, it's the space in use on the drive that's relevant here, but it wouldn't be a bad idea to go for your usage times two. Or you could just take one snapshot and see how much you're using. But bear in mind, of course, that as your operating system grows in size, so too will the snapshots. So Timeshift is brilliant. You can run Timeshift onto a separate drive, and if you don't have a separate drive, if you're doing this on a laptop with only one drive, then it isn't really ideal. This is a limitation of Timeshift: it doesn't support remotes or SSH; the target really has to be a drive on the computer itself. As a desktop user this is perfect for me, because I can just plunk in another drive. On a laptop, you shouldn't really be taking your on-site backup onto the same drive, because that on-site copy has no protection against that disk failing. If the disk fails, the backup goes with it.
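The sizing rule of thumb from earlier (space in use, times the number of snapshots kept, optionally doubled for headroom) can be sketched in a few lines of Python. This is a worst-case estimate that treats every snapshot as a full copy, as described above; the function names and the `safety_factor` parameter are assumptions for the example, not Timeshift settings.

```python
def snapshot_space_needed_gb(used_gb: float, snapshots_kept: int,
                             safety_factor: float = 1.0) -> float:
    """Worst case: every retained snapshot is as big as the space in use."""
    return used_gb * snapshots_kept * safety_factor


def backup_drive_is_big_enough(drive_gb: float, used_gb: float,
                               snapshots_kept: int,
                               safety_factor: float = 1.0) -> bool:
    """Compare the backup drive's capacity with the worst-case requirement."""
    return drive_gb >= snapshot_space_needed_gb(used_gb, snapshots_kept, safety_factor)


# 100 GB in use, three snapshots (daily, weekly, monthly), 500 GB backup drive.
needed = snapshot_space_needed_gb(used_gb=100, snapshots_kept=3)
print("Space needed (GB):", needed)
print("Fits on 500 GB drive:", backup_drive_is_big_enough(500, 100, 3))

# Same setup with the "times two" headroom suggested for future growth.
print("Fits with 2x headroom:", backup_drive_is_big_enough(500, 100, 3, safety_factor=2.0))
```

With the headroom applied, the 500 gig drive no longer passes, which is a reasonable prompt to either keep fewer snapshots or buy a bigger backup drive.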
So what I would recommend doing, well, actually, because there are so many backup tools, your options are wide open. You could use rsync and set up a NAS for yourself, or you could plug in an external hard drive, one of those plug-in SSDs, a Passport or whatever they call these, and just run rsync onto that. Ideally you want to do this automatically. Or you can use something like CloudBerry and create a backup plan to an external drive or to a NAS; a NAS would be more ideal, because that can really be run automatically. So you have options, and you can research those. But Timeshift can't be run to remotes or even over SSH. Timeshift does have a command-line interface, and this isn't talked about much. If you can't get past GRUB, and this has happened to me, then so long as the operating system is intact and you don't have total bedlam in terms of corruption on the disk, you can get into that CLI from the recovery menu and run a full restore just using the CLI. So, as I said, despite all these other cool tools and tricks, that has been enough to keep my system from requiring reinstallation for the three years since I made this commitment to backup. The next tool I'm going to talk about here is CloudBerry. CloudBerry is really cool, and this is what I would recommend for backing up to cloud storage incrementally. It's an incremental syncing tool that can be used to sync up to cloud remotes. Really, the three backup tools I'm going to strongly recommend that you use, that anybody uses actually, are CloudBerry, Timeshift and Clonezilla. You can try all the other ones, and they might be better for you, but for a lot of people these are enough to get you out of problems.
So CloudBerry will do incremental backups to off-site remotes. You can run it at file level or block level, and it supports encryption and compression. The caveat is that you need to pay for a license. But if you are interested in encryption and backing up to remotes, I strongly suggest you do pay for a license, because there's just no reason to economize if you're going to be investing in data protection. You can add remotes, choose what you want to back up, configure a schedule; you can do the whole works, basically, and get yourself a nice backup plan running. So that's CloudBerry. Compared with Clonezilla, what I would use this for is backing up incrementally; I wouldn't really be moving Clonezilla disk images to the cloud. If you just want to do a smart incremental backup to cloud storage, this is where CloudBerry is the perfect tool for Linux. Now, Clonezilla is disk imaging. Norton Ghost, Acronis True Image and other tools like these are disk imaging tools too, and this is a separate category of backup altogether. What it means is that you're not backing up files or data blocks; you're backing up hardware, the actual drive or the partitions on the drive. You need to run Clonezilla from a live USB. It's a totally full backup methodology, and it copies over the whole drive or partition. So on the full, incremental, differential schema it's on the far end of full; it just does the whole thing. The target can be local, or with Clonezilla you can actually go direct to a remote.
If you were running Clonezilla from, say, business premises with business-grade internet, with 100 or 200 megabits per second of upload speed or greater, you could credibly back up a system straight to the cloud, no problem. I can't do that, because my internet is about one fiftieth of that. So that covers the GUIs for Ubuntu. In this section I'm just going to talk briefly about cloud backup storage and where you can put your backups in the cloud. Practically speaking, I do put my desktop image in the cloud, but it's kind of pointless. I've done a test restore and it does pull in nicely, but I can't conceive of a time when you'd actually want to use it. In the first instance I'd be using Timeshift, which is what I've used every time I've needed a backup. If that really failed, and my computer was in such a bad state that the disk had failed or something, I would go back to my latest Clonezilla image. I can't really think of a time when pulling from the cloud would make a lot of sense, but it's there. You might want to just back up to the cloud and back up with Clonezilla; that's something you could do. You could do your daily backups to the cloud, skip the local backups paradigm completely, and then have a Clonezilla image, or a full backup on a NAS, as the harder backup approach. If you are storing stuff in the cloud, it probably won't be your operating system; it's more likely to be your user files.
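The bandwidth point made earlier, 100 or 200 megabits per second versus roughly a fiftieth of that, is easy to put into numbers. Here is a minimal Python sketch; the 100 GB image size is an assumption for illustration, the function name is made up for the example, and the estimate ignores protocol overhead and compression.

```python
def upload_hours(image_gb: float, upload_mbit_per_s: float) -> float:
    """Idealised upload time: image size divided by upload speed."""
    bits = image_gb * 8 * 1000**3                      # decimal gigabytes to bits
    seconds = bits / (upload_mbit_per_s * 1_000_000)   # megabits per second to bits per second
    return seconds / 3600


image_gb = 100  # assumed size of a compressed disk image

for speed in (200, 100, 2):  # 2 Mbit/s is one fiftieth of 100 Mbit/s
    hours = upload_hours(image_gb, speed)
    print(f"{speed:>3} Mbit/s upload: about {hours:.1f} hours")
```

On a fast business line the upload is a matter of hours, while on a slow home connection the same image takes days, which is why incremental tools matter more the slower the link is.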
Cloud storage is an obvious place for off-site in general. As I said, you can just store a copy of your computer in a friend's house as your physical off-site, but that requires you updating a tape or drive, going over to your friend's house, and repeating that process, and you're probably not going to want to do that all that often. You need to physically move yourself somewhere off-site, and there could be disruptions of various types in your area. So although there's nothing wrong with physical off-site as a backup methodology, the cloud is constantly available, 24/7, over the internet, and you can do this all automatically. That's something you can't do if you're keeping a backup copy in the boot of your car, for instance. You could provision your own infrastructure for backups, like renting a cheap VPS, or you could rent from a cloud storage provider. You could use Google Drive or Dropbox to store backups, but if you're backing up something like a whole Linux system, it doesn't really make much sense. They're very expensive per gigabyte, comparatively speaking, and you'd be much better off, from a cost perspective and in terms of scalability, using either object cloud storage or a dedicated backup storage plan, of which there are a few. If you're going to be using remote storage, it's also important to check whether it's supported as a remote by whatever tool you're thinking about using. For example, if you're looking at Duplicati, you can see in their documentation that it supports Dropbox, Google Drive, Google Cloud Storage, B2 (that's Backblaze), SFTP and WebDAV, but it doesn't do pCloud and Box.net. And that's it, and I've just crossed the hour mark.
So I'm going to finish this quickly with my own backup approach, just to show you how a backup approach can work and what you can do to get your own one up and running. As I've said here, my backup approach is but one of many possible ways to skin the cat of backing up a Linux computer, but I will discuss it nonetheless, just to put this all together and show you what a backup strategy can look like, and one that actually does work. I have posted documentation regarding my backup plans on YouTube, so please feel free to follow my account there at danielrosiljlm. The jlm is short for Jerusalem; it's danielrosiljlm, all together in one word. My goal a few years ago, and this was my old apartment here with its fridge right next to my computer, was just to find a totally robust way to back up the Linux computer under this table. My goal, as I've said, is never to have to reinstall, and it's working. I think I've covered all this, so I'm just going to skip through here. So here is what I came up with, and I think it's a pretty decent approach for backing up a Linux desktop. For a laptop, as I suggested before, I would swap out Timeshift for something incremental also using rsync, and I would change the target to a NAS or, worst-case scenario, to a plug-in hard drive. You could run rsync manually, or you could run Grsync, and that would be kind of manual: you need to hook up your external hard drive or SSD and run that process. It would do the trick, but it requires human effort, which is never a good idea. What would be better is if you could install a second drive.
I don't know a lot about laptops, so I don't know if installing a second drive is even possible on a hardware level, but I think I've seen a couple of how-tos saying it's possible. If you could, that would be much better, and then you could just use Timeshift and back up to the second drive. So, for Timeshift, what I do is this incremental on-site daily backup. As I said before, I threw in a second internal drive and I just keep Timeshift restore points on it. Bear in mind my caveat that each restore point is going to be equivalent to the size of your system in use, and will grow over time as your primary system does. So what I would recommend is this: if you had, let's say, a 250 gig SSD as your main drive, or a 500 gig, why not be generous and throw in a two or four terabyte hard drive? Then you can keep as many snapshot restore points as you feel like; you won't be constrained in any way. That's essentially what I've done, although my drive is currently the same size as the main one; I could have gone more intense in that respect. This is my primary day-to-day means of data protection. As it says, I haven't actually required anything else, so that's pretty cool. Next, Clonezilla. There's a NAS in the diagram here, but what I used to do before I got this NAS was just use another internal drive, or you could connect a hard drive in an enclosure, and that would be fine too. And this is just a backup to the backup, essentially.
So just remember that. I do this less regularly. I think once a month is perfectly sufficient, because this would only be needed in the event that Timeshift somehow wasn't good enough. An example where it would be required is if the disk fails. In that case, first I'd need to buy a new disk, then I would need to reinstall Ubuntu, reinstall Timeshift, and then hope that my old restore point from the backup drive would work with the new Timeshift, and I haven't tried this, so that it could just pull in the data. But I wouldn't be overly confident about that. In the event of disk failure it would probably be easier just to go back to the Clonezilla image. And that's pretty rare; we're talking about something that might happen once every few years, so it's not the end of the world to potentially lose a few weeks of desktop data if it's only that irregular. So I think once a month as an RPO is quite reasonable here. What I do is, once a month, I run Clonezilla and do a drive-to-image backup. I literally back up the entire drive; Clonezilla compresses that to an image, and that goes onto the NAS over the LAN. It's a bit slow. If you just use another internal drive in your computer, it will be quicker, but if we're worried about stuff like electrical surges doing damage, it's a small bit less safe, and you don't have RAID.
So if you do have a NAS, it kind of makes sense to make use of it, just to get that added RAID data protection, though RAID of course is not equal to backup. The second methodology, doing it internally, is going to be faster for you, because your SATA transfer speeds, running directly through the motherboard, are going to be faster than what you get over Ethernet, even if that's just on the local network. Then there's the advantage of backing up everything in one place, including cloud data. If you back up your cloud data too, and I highly recommend you do, and I do this, then everything for me comes down onto the NAS. That's kind of a bottleneck in the diagram, if you want to think about it that way. And then to get my off-site copy of the Clonezilla image, I just need to back up the NAS, and I back up everything in one shot. That's what I've started doing. I used to keep separate drives for the NAS and for my desktop, and then I realized it didn't really make sense. Even though it's a bit slower to run Clonezilla onto the NAS versus onto a drive connected to the computer, I think it's worth it, because it removes a bit of complication. So backup one was Timeshift, backup two is Clonezilla, and backup three is the off-site Clonezilla. I talked about my Synology NAS here, and that is essentially what I'm doing: I'm using Hyper Backup, which is one of the tools in Synology's DSM, to copy the NAS. If you don't have a NAS, or a Synology NAS, and you're not following this, then in a nutshell you can just create another Clonezilla backup. You don't need to complicate this. You could back up one time onto, let's say, an external SSD.
Okay, that's your on-site. Do that once a week, sorry, once a month. Then go out to a computer store, buy yourself another external SSD, back up onto that once every six weeks, and store it somewhere off-site. So if you have an office, you could run through this procedure every six weeks: bring your SSD home from your office, update it with another backup, and bring it back to work the next day. You're vulnerable, without an off-site backup, for the one day between leaving your office and coming back. You can also rotate disks as another option to get around that problem: have two and keep them in rotation. But that's really a small detail. So that would be another easy way to do it: have another external SSD, keep it off-site, and just rotate them. That way you will have an off-site. You don't need a NAS for this; you just need two external drives and somewhere off-site. That could be your office, it could be a friend's house, somewhere you can access repeatedly and with reasonable ease, and ideally somewhere not dependent on another person. If you rent office space in town, if you work for yourself, and you can just go into the office once every six weeks and bring in the disk, that's better than having to rely upon work. But either should be reasonably okay. So this saves me from having to run Clonezilla twice.
That's the way I do it, because Hyper Backup runs automatically, so I don't need to run Clonezilla two times, which you would if you were manually writing two Clonezilla backups onto two different pieces of storage media. As I said, if you don't have a NAS you could do exactly that: run it every month onto your on-site drive, and then every two months onto another drive, and bring that back and forth to your office, or, say, to a locker you rent in the bank. The options are manifold; there are many ways to skin the cat. You could alternatively push each Clonezilla backup up to cloud storage using rclone. I don't think you can get data-block-level syncing there, though. Even if you give the backup archive the same name and overwrite it in Clonezilla, I don't think rclone will push up just the delta. But if you have great home internet, by all means you could just shove it up to the cloud every six weeks instead of writing it to a physical drive and bringing it somewhere off-site. That would be preferable, of course. When I finally get good internet, even if that means a 10-hour upload process, I'm going to be pushing it up, and even though I'm sure that'll incur more cloud charges, that's totally fine by me. I will be migrating to that approach as soon as it's practical for me. So the point here is that any method would work. There are many ways to skin the cat. The important thing is that you take backups in the 3-2-1 manner, one copy on-site and one copy off-site, and as I've just said, it doesn't really matter in that respect whether it's cloud or physical. The more important thing is just that you get it off-site. So, backup four here: CloudBerry off-site.
So this is if you really want to round it out. This allows you to do an incremental desktop backup to cloud storage, irrespective of whether you've got great internet or terrible internet, because it's incremental, so it's only syncing up the changes. A good tool for this, and this is why I showed it, is CloudBerry.

If you're doing this, bear in mind that you are going to be creating (sorry, the slide should say four) four copies of the same data. We're going to have Timeshift running on our local computer. That's one. We're going to have Clonezilla writing onto our local NAS or external drive. That's two. We're going to have Clonezilla writing onto our off-site drive. That's three copies of the data. And this would be copying the data for a fourth time if we were to also back up incrementally to the cloud. So I think that's a bit excessive.

And I say it's not needed because, to be honest, as I've explained before, I can't think of a really sensible instance where I'd need this. Timeshift has an RTO (recovery time objective) of minutes. That's quicker than the cloud. And Clonezilla (sorry, the slide says CloudBerry, that should read Clonezilla) works disk to disk. So if Timeshift didn't work and I really needed to get a bare-metal recovery in place, I would do that through Clonezilla, because to get a CloudBerry restore operating if the disk fails, I'd have to not only replace the disk, I'd have to reinstall the operating system.
That's the operating system layer. I'd then have to install the application layer, which is CloudBerry. So it's a lot of work, but if you do want to cover all bases, by all means do that.

So I hope this intro to Linux backups has been useful. In summary, the 3-2-1 approach is really the key thing. The differences between the various backup tools, and duplication versus incremental, matter less, in my opinion, than getting the fundamentals in order: that you are keeping one copy onsite and one copy off-site. There are many ways to skin a cat (the slide apparently says "can"; that should say "cat"). Choose whatever works best for you: encrypted versus non-encrypted, incremental versus differential.

If you do have any questions, please feel free to reach out to me on GitHub, that's danielrosiljlm, or on LinkedIn or Medium. I've also got a contact email listed here on YouTube. So thank you for watching this video, and to your backup success.
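As a footnote to the summary, the 3-2-1 idea can be expressed as a small check. This is a minimal sketch, not something shown in the video: the names `BackupCopy`, `check_321`, and `satisfies_321` are hypothetical, the thresholds follow the common reading of 3-2-1 (at least three copies, on at least two kinds of media, with at least one off-site), and the example plan simply mirrors the Timeshift and Clonezilla copies counted earlier in the talk. Whether the live data itself counts as one of the three copies varies between descriptions of the rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str      # label for the copy (illustrative)
    medium: str    # kind of storage the copy lives on
    offsite: bool  # True if the copy is kept away from the computer


def check_321(copies):
    """Evaluate each part of the 3-2-1 rule separately (common reading)."""
    media = {c.medium for c in copies}
    return {
        "three_copies": len(copies) >= 3,
        "two_media": len(media) >= 2,
        "one_offsite": any(c.offsite for c in copies),
    }


def satisfies_321(copies):
    """True only if all three parts of the rule hold."""
    return all(check_321(copies).values())


# Example plan modelled on the copies listed in the talk; the medium labels
# are assumptions made for the sake of the illustration.
example_plan = [
    BackupCopy("timeshift-local", "internal-disk", offsite=False),
    BackupCopy("clonezilla-onsite", "nas", offsite=False),
    BackupCopy("clonezilla-offsite", "external-ssd", offsite=True),
]

if __name__ == "__main__":
    for rule, ok in check_321(example_plan).items():
        print(f"{rule}: {'ok' if ok else 'missing'}")
```

Dropping the off-site drive from the plan makes the `one_offsite` check fail, which is the point stressed throughout: the tool matters less than getting one copy off-site.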