Hello everybody. The next talk will be by Lars Wirzenius, about using Obnam for backing up your data.

Hello everyone. So this is a fairly short talk and I suspect there will not be enough time for any interactive Q&A, but if you go on irc.oftc.net, on the #obnam channel, I will answer questions there later, or you can come ask me questions afterwards if I happen to look like I'm not running away.

Many years ago, in about 2004, I had a weekend and I realized that I hadn't done a proper backup for a long time. I had, I think, three computers at home, all of which contained various bits of live data, and it was really tedious to do a proper backup. So I went and bought a stack of CDs, I started running some scripts to make backups, and then I started verifying the CDs, and maybe one in five of them failed. This was not fun. It took all weekend, and this is obviously why I didn't actually make backups very often. Yes, there was a time when I didn't do backups very often.

Then I started thinking about what I could do differently. Some people suggested that I should buy a tape drive, and I said very dirty words to them. I don't like tape drives. Other people do, and that's fine; I don't care. I could switch from CDs to DVDs, which would mean that the media cost goes up but I would have many fewer of them, but even then it was clear to me that I would need several DVDs and the number would go up very soon. So I decided to look at switching to hard drives, and I did some calculations, and hard drives had just become so cheap that this was feasible for me. Obviously hard drives have the problem that eventually you need more than one and you start having piles of hard drives, but at least it's easier to copy data over to a new drive: you can mount all of them as one file system and then just copy.

I did not find a good program for backing up to hard drives. I looked for about two years at various choices, and I wanted to host some of those drives in different locations over the network, which brought some complications. So I didn't start out to write a backup program, but that was what I ended up doing. One of the issues I had was that the existing backup software was written for tape drives, treating hard drives as virtual tape drives. That makes no sense to me.

There were a few features I wanted. One of them was snapshots: I wanted every backup generation, every individual backup run, to be a snapshot of all of my live data, so that I could pick and choose which ones to keep when I start expiring old backups. I did not want "restore generation four" to mean restoring generation one first, then applying a delta from generation two, and then deltas from generations three and four, before I get what I want. I want to go straight to the generation I want, and that's all.

Since I wanted to host at least some of my backups elsewhere, I wanted encryption, and encryption that is done locally on my computer before the data leaves the computer. There were programs that did this, but they lacked some of the other features. And since I'm doing online backups, and since hard drives back in 2005-ish were still very, very expensive (that was a joke), I wanted to have deduplication.
I don't want to transfer data over a one-megabit ADSL line over and over again if it's already on the remote; I want to reuse that bit of data. And I don't want to have to spend huge amounts of effort to get this to work. In a previous life, in the 90s, I was tasked with setting up Amanda on our office network, and that was not fun. Any backup solution that requires me to read more than one page of instructions has basically lost. I'm getting too old for reading manuals.

Obnam does all of those things now. It's taken a while, because I write Obnam in my free time, but it does this now. It has a few other things that are slightly interesting as well. It has a FUSE plugin, which means that you can mount your backups and look at them as if they were a directory tree: every backup generation is a separate directory, and you can look at any of them as you wish. I wrote a manual. It's not one I would read, but it has a "read this first" section that is very, very short. I wrote automated test suites, which are not the best ever but catch most of my mistakes. Obviously users don't directly care about test suites, because they don't want to contemplate the possibility of a backup program ever having had a bug, but I care, because this makes it much easier for me to trust things. It also has the world's ugliest website. Those of you not reading your email can go to obnam.org and marvel at my web design anti-skills. Daniel? Okay, I'll fix that.

Obnam has some problems as well, and I want to be entirely honest about them. One of them is that almost the entire development team is here: it's my hobby project, and there are a few people who occasionally send patches or help on the mailing list, but it's not a large community. Obviously this is going to grow now by a number of people. It's known to be very slow; it is possibly the slowest backup program ever, though I'm working on fixing that. It's not finished yet, and occasionally there have been fairly serious bugs, up to and including destroying people's backups. My test suite is nice, but it's not perfect. Please help. As far as I know all the known bugs are fixed now; sorry, all the bugs that might corrupt the backup repository are fixed now, and yesterday morning, before I left for DebConf, I fixed a bug that would have prevented parts of the demo that is coming up. I made a release yesterday. It also has the world's worst marketing team of any software ever.

The world has large numbers of backup programs. It seems that every hacker and every sysadmin at some point realizes that, yeah, they really should make a backup, shouldn't they? So they write a wrapper script around some existing tool, and some of them release it as free software, and this is good. I do not want to compare Obnam to any other backup program. I don't have the time to go and look at them properly, and any comparison I make would be unfair. However, if someone else wants to do this, it would be a service to all free software users, or rather all free software users who have any data they care about. A proper comparison would be helpful. I don't care which one you use, as long as you make backups and never lose data; or rather, as long as you never lose data. If you never lose data and don't make any backups, good for you.

So I'm going to do a couple of live demos, and I need someone to hold my mic. No, it's fine. It's fine. I have two machines here. The left tab, the current tab, is my laptop. The right tab is my server, or rather it used to be my server. There we go.
I'm not telling anyone where the server is, just in case you go and break it. That would be telling.

The first demo will do what I call a pull, sorry, a push backup. This is an example configuration file for Obnam. It uses the INI format, which is common. The root variable specifies where the live data is, the data you want to back up, and I'm going to back up my backup program. The repository variable is where the backups should go, and Obnam can access the repository over SFTP, so this goes to the server machine and puts the backups in that directory. If the directory doesn't exist, it gets created. I also want to encrypt my backups using a GPG key, and I'm setting the source for random bits to /dev/urandom so that the demo goes faster. Some people tell me that this is the proper default anyway.

This is how you would use it. You specify the configuration file, and then you use the subcommand called backup. If you put the configuration in a file called obnam.conf in one of the usual directories, it will be picked up automatically, so you don't have to repeat the config. And Obnam takes a while, so I should sing and dance for a while. What it's showing there is how long it has taken, how many files it has found, and how much data it has found, and it would tell you, if my font were smaller, which file it is currently working on. Some people think more useful feedback could be provided, and that is entirely true, but I have never found a solution that satisfies everyone. Come on. There we go. Then it gives you some statistics on how long things took. Then you run the second generation. There were no changes, so this should... oh, it did go faster. And that's how you run a backup.

The FUSE plugin is what I'm going to demo next. You can mount the backups you have made, and that's what it looks like. However, I will show you this in Nautilus, because that's more graphical, unless this is too small. Can everyone see this? So there is a directory for every backup generation. The first backup is called generation 2, the second one is 11. These are not random, but effectively weird numbers that Obnam chooses; don't worry about the fact that they are very weird. We can go and open a backup. Yes, as was just said, the generation numbers Obnam chooses are strictly monotonic, and if I understand my maths correctly, that is true. So this is a view into a backup stored on my server, which is somewhere unspecified. You can go and open files, and open them with various programs; it looks to the tools exactly like a file system. You can't make any changes: this is a backup, you can't delete files from your backup. And it is a little bit limited in some corner cases, meaning that sometimes programs can get a little bit confused. But what you can do is say, okay, I want to restore those stickers into my home directory, into the demo directory. And there we go. And as usual with FUSE, that's how you unmount.

I have a second config file. This one does a pull backup, where it basically accesses the live data on the server over SFTP and stores the repository locally. It's equally boring to look at. It takes a while. I thought it went faster than this when I prepped; must be a bad network connection to my server. With Obnam, you specify both the location of the live data, the root, and the location of the backup repository, using either a local file path or an SFTP URL, and Obnam doesn't care where things are. Okay, there we go.
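For reference, the push-backup part of the demo boils down to roughly the following sketch. The root, repository, and encryption settings are the ones named above; the exact option names can vary between Obnam versions, and the paths, host name, and GPG key ID here are invented purely for illustration.

    # Write an illustrative obnam.conf (INI format); all values are examples only.
    cat > obnam.conf <<'EOF'
    [config]
    root = /home/liw/obnam
    repository = sftp://backup.example.org/~/backups
    encrypt-with = 0123ABCD
    EOF

    # Run a backup with that configuration; an obnam.conf in the usual places
    # is picked up automatically, or it can be named explicitly as here.
    obnam --config obnam.conf backup

    # Mount the backups with the FUSE plugin: one directory per generation.
    mkdir -p ~/backup-view
    obnam --config obnam.conf mount --to ~/backup-view
    ls ~/backup-view                 # generation directories, e.g. 2 and 11
    fusermount -u ~/backup-view      # unmount when done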
I could show you how to browse this backup repository as well, but it looks exactly like the previous one, so it doesn't matter. And I'm about out of time, so thank you. Thanks. Are there any questions?

Hi, Lars. I'm using Obnam on several machines, and I regularly have the problem that there are old locks that I have to remove by force, so I run obnam force-lock. There was already a discussion on the mailing list that there should be a timestamp on the locks, or some other information, so that Obnam can know when it is safe to ignore those locks. What's the status of this on the roadmap?

It turns out that if I put any information into the lock file, at the moment it gets encrypted, meaning that if you have to unlock from a different client (Obnam repositories can contain data from multiple backup clients), it gets encrypted with someone else's key, so you can't decrypt it, and then you can't unlock it either, which is slightly unfortunate. The status is that once I'm done with what I call the green albatross format, I hope to start looking at these annoying problems, the ones slightly smaller than every backup taking until the heat death of the universe. But certainly that kind of thing needs to be fixed, because it's a fairly big problem.

Hello, Lars. Maybe you said this, but I didn't understand it: do you handle extended attributes in the backup? Yes, extended attributes are handled. That's very good, thank you.

Hi, you said it's slow. Why? I like to make people cry. I'm not so much interested in your personal motivation for making it slow, but rather the technical reason, so how did you make it slow? It took years and years of work. No, the technical reason why Obnam is slow is really several reasons. One of them is that I'm using the copy-on-write B-trees that Btrfs also uses, and I have my own pure Python implementation of them. They're in many ways really nice, but I couldn't make them quite fast enough; they're not as fast as Btrfs's. Another reason is that so far I have largely concentrated on making it correct, apart from, I don't know, little bugs and so on: handling things like extended attributes, being able to remove backup generations, actually being able to do encryption and deduplication at the same time. And I thought, oh yeah, by the time I'm done with this, hardware is going to be so fast it doesn't matter how slow I make it. I was wrong. One concrete technical example: when Obnam does deduplication, it does so by taking the live data and dividing it up into chunks, and currently every one of those chunks gets uploaded separately to the server. That's a round trip that can be hundreds of milliseconds long. The default chunk size is one megabyte, and most files, source files for example, are smaller than one megabyte, so you pay a round trip for practically every file.

So does that also mean you can back up to some other file system using FUSE? Some backup solutions do it with hard links, which you can't use on a FUSE file system that is already encrypted, so can I just give it a path on the encrypted file system and it will work?
I'm not entirely sure I understood that, but I do handle hard links, and Obnam makes a lot of effort to make sure that there are no inherent restrictions like that, because, and that's all part of being correct, I don't want restrictions like not handling hard links, or not handling files of zero bytes, or not handling files that end with .jpg, or whatever. There's no point in having a backup program that can't handle your data. Obnam also has a deduplication mode that basically never deduplicates, and another mode called verify, where it always verifies that there is no checksum collision. That is to accommodate people who do research on checksum collisions, because it would be really unfortunate to spend five billion CPU years generating a SHA-512 checksum collision, make a backup, lose your hard drive, restore, and not have the collision anymore.

Is it also possible to make a backup using a disk attached to the local machine? Yes. Okay, thank you.

Hi, you said it supports extended attributes. When you say extended, do you mean as in SELinux labels, the SELinux permission things? Maybe. I've never actually used SELinux, but if it stores them using the standard extended attribute mechanism, Obnam should handle them, as long as Obnam has permission to read and write those.

I think we are very short of time, so maybe one more question. Hi, my question is: with the increasing use of file systems like Btrfs and ZFS that have advanced features, is Obnam able to leverage those features somehow? No. Obnam is written to assume as little as it can get away with from the file system where the repository is stored. Meaning, for example, it can use the FAT file system: yes, you can store your backups with Obnam on an SD card that works in your camera. Thanks. Okay, thank you. And I have Obnam stickers if anyone wants them.
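For completeness, the lock and deduplication points raised in the Q&A map to command-line usage roughly like the following sketch; the option and subcommand names are assumptions based on the discussion above and may differ between Obnam versions.

    # Remove a stale lock left behind by an interrupted run, as discussed above.
    # Only do this when no other client is actually using the repository.
    obnam --config obnam.conf force-lock

    # Deduplication modes mentioned above: "never" disables deduplication, and
    # "verify" compares chunk data byte by byte instead of trusting checksums,
    # so a checksum collision cannot silently merge two different chunks.
    obnam --config obnam.conf --deduplicate=verify backup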