Hello, good afternoon. I'm Kern Sibbald, and I'd like to talk to you today about Bacula, which is a network backup program. I have to apologize because I'm running on a rather old machine that's 700 megahertz and has only 256 megabytes of memory, and meanwhile the SUSE ZEN Updater is eating my whole CPU, which you can see down there, sort of toward the left. So everything's going extremely slowly here; hopefully we'll get the presentation up shortly. I tried to kill it off, but I didn't succeed. It's still doing its thing, but I think we'll be able to get on with the presentation. Okay, here we go. Before I start, could I see a show of hands of people who are already using Bacula or who have tested it? Oh, well, thank you — a good number of you. If all goes well, I'll be able to give a live demo once the Updater has stopped its thing, and we can show you exactly what happens in real time; perhaps I can even show you some things that you haven't seen with Bacula yet. One of the Bacula developers asked at a conference one time, "Well, what do you do for backups? Do you do backups?" And you got all the answers that you see up there. Even if you answer "yes, I do daily backups," a good backup program like Bacula could potentially resolve a lot of the problems that you have. It can easily help you find and restore files through a GUI interface. You can restore to any point in time — providing, of course, you've made a backup at that point in time. It helps a lot in knowing what you've backed up and when, and that's not so easy if you're using tar or some other system. Another thing is that if you have a fairly large shop, Bacula scales very well; in fact, there's one company in the United States that's backing up 2,000 machines with Bacula. And finally, Bacula has bare metal recovery, which can help you in a total disaster.
Bare metal recovery is a non-trivial thing, and hopefully someday we can get the Linux distributors to cooperate a little bit more and get a better recovery disk for everyone. Bacula is a network backup solution, which means everything is communicated between the various daemons via the network. It's designed to be portable to many operating systems, and I'll give you an example. Originally, the project goals were as you see up there, and they still hold pretty well. Today it runs on everything from PCs up to mainframes, and somebody has told me that he's writing a client for a Palm, which would be interesting. We already have a lot of enterprise features, some of which I'll describe to you in a few minutes. And probably what should have been number one on the list: we would like to make sure that, providing you have the right hardware and keep a copy of the software around, Bacula can read your old backup tapes for at least 30 years. And the last one: we have a free and open source license, GPL version 2, with a few clarifications that just help explain things. Bacula is a name that John Walker, one of the original founders of the project along with myself, devised. It comes from "backup" and "Dracula" — and Dracula comes from the fact that Bacula comes in the middle of the night and sucks the vital essence out of your computer. So it's a simple play on words. I started the project in January of 2000, worked on it for a little more than two years before I felt it was ready, and then released it on SourceForge in April of 2002. And then you see a few other dates there. The project's not large, but still we have a good number of project members on SourceForge: 41. There are currently 14 developers — one more than is listed here — who have Subversion (SVN) write access, and originally there were 35 developers that had access to CVS. About a month ago we changed over, and I'll explain the difference there.
A few of the mailing lists — you can see they're listed with the number of people that have subscribed to them, and then the downloads for a few of the different versions. The very first version that I released to the public was version 1.16, and I was pretty surprised that not very many people downloaded it. In hindsight it's pretty obvious, because it's a new program that has a totally different tape format. Even though it's open, it's not a typical tar or backup format. However, the last major version before the current one was 1.38.11, and there you see the number of 12,000-some downloads — not huge for most projects. However, that's about twice what Amanda is getting for downloads, so I'm quite happy; Amanda is supposed to be the most popular open source backup program. Now, Bacula has six major components, and I'm going to go through them one at a time. I'm doing kind of a brief overview, because Bacula is so large I can't possibly tell you about everything. The fundamental philosophy of Bacula is that the control — and, for the most part, the administration — is centralized in the director. It has the concept of a job, the basic unit that the director deals with, which is a backup of one client and one set of files. The director does all the scheduling; it has its own internal scheduler. It creates jobs, supervises them, distributes the output where you want it to go, etc. It also maintains a catalog, which is an SQL catalog — I'll talk a little bit more about that. That allows you to very quickly find where files are, which files you backed up, etc. Normally, unless you're in a very large shop, you'll have one copy of the director; directors don't typically work very well together. The second component is the file daemon, or, as we often call it, the client. As its name implies, it backs up and restores files. It runs on each client machine and communicates over the network with the director and the storage daemon.
As you might imagine, it needs access to all files on the system, so it must run as root or as Local System or something like that, depending on your operating system. The code for the file daemon — and in fact for all of Bacula — is common code. There are little pieces of it, particularly in the file daemon, that are specialized to each operating system, because there will be differences even across Unix machines, for access control lists for example. And of course, Windows has its own way of accessing files and permissions. Typically you have a lot of file daemons for one director: at least one for each machine that you're going to back up. The storage daemon is another important component; it's the process that manages the media. It reads and writes to disk, tape, DVD and USB — that's what we support now. Normally, the storage daemon waits until the director contacts it and authorizes it to run a job. Then it'll accept data: say you're doing a backup, it'll accept data from the file daemon, and if it's doing a restore, it'll send data back to the file daemon. If it's doing what we call a migration job, it'll handle reading and writing the data itself, without the interaction of the file daemon. While it's doing a backup, it also sends the storage location — where it put the information on disk or tape — to the director, which will then put it into the catalog. So a storage daemon and a file daemon don't directly access the catalog. Typically you have one storage daemon per director, but a lot of people run several if they have several machines with tape drives on them. The console is probably the most important piece for users and administrators, because it allows you to communicate with the director and control it. You can do virtually everything in it: you can start jobs, you can review the output that was produced by each job, and you can query the database. There are quite a number of consoles available.
bconsole is a TTY interface, which is very useful if you're SSH'd into your site from someplace and you don't have a really fast communications line. There's a wxWidgets console that runs on Linux, Unix, and Windows; it could also run on Mac OS, but I don't know if anybody's tried it. We have a GNOME interface that is pretty much deprecated now — it was never developed fully. Then there are several web interfaces, one of which I hope to show you at the end of this presentation, and currently I'm working on a really comprehensive console that is being developed in Qt4. It's called the Bacula Administration Tool, or BAT. One of the features that you can have for consoles is what we call restricted consoles, which allow users to see, start, or interact only with jobs that concern them — that are authorized for them, for example for their client. The one component that we use that was not written by the Bacula team is the catalog database. We interface to three open source databases: MySQL, PostgreSQL, and SQLite. SQLite is not really recommended for serious production work, but it's very useful because it's compiled in and has no administration, so it's very simple for testing. I believe Bacula is the only product that uses an SQL database, although I've been told that IBM's Tivoli has certain SQL commands. It has the disadvantage that things can sometimes be slow, because you're dealing with a big database and it's not specific code written just for Bacula; but on the other hand, it provides enormous capability for users to interface to the information. The catalog database allows us to track jobs and figure out when they were run — and particularly where the files were stored on the tape or the disk — allowing rapid restores of your data. And as I said, either through Bacula or through programs that you can write yourself, you can query the database and get a tremendous amount of information.
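As a taste of what querying the catalog yourself might look like, here is a hedged sketch against the standard Bacula catalog schema (the Job, File, and Filename tables are documented; the file name searched for is just an example):

```sql
-- Find which jobs backed up a file called "passwd", newest first.
SELECT Job.JobId, Job.Name, Job.StartTime
FROM Job
JOIN File     ON File.JobId = Job.JobId
JOIN Filename ON Filename.FilenameId = File.FilenameId
WHERE Filename.Name = 'passwd'
ORDER BY Job.StartTime DESC;
```

The same kind of join against the Path and JobMedia tables tells you which volume a given copy of the file landed on.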
Bacula — the director, in fact — has pruning of the database records built in, via retention periods that you set yourself, so that once you've reached your retention period, typically six months or a year, the database will tend to remain stable in size. And if you are scaling to a huge number of machines, Bacula will interface to several databases, although we only support one kind of database at a time currently. The last component, which I won't talk about much, is the tray monitor. It sits in the system tray, and you can open it up and see real-time activity in any of the daemons that you want. And here is sort of a picture of most of the components. The user usually interfaces through the console, which talks to the director. Let's say you're doing a backup job — what happens? The director will contact the storage daemon and say, "I want to run a backup job." The storage daemon will ensure that it's a director that's authorized to do that. It will then send back authorization codes, etc. The director will then contact the file daemon, giving it the list of files to back up and anything else it needs; for example, it may say "compute checksums — a SHA-1 code — for each file," whatever. Then the file daemon will contact the storage daemon, and once they've authorized each other, it will transfer the files: all the data in the files, plus what I call the attributes, which is essentially a packet that contains the file name, the time and date it was last changed, and so on. The storage daemon then in turn puts the information on whatever device you have selected for that job. At the same time, it sends the file attributes — the file name, the time and date stamp, the size, and a few other things — and the storage location back to the director, which will put them into the catalog database. With that sort of scheme, doing a restore is really quite quick.
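From the user's side, that whole cycle is driven by a handful of console commands; a bconsole session might look roughly like this (the job name is illustrative, not a default):

```
*run job=Client1 level=Incremental yes
*status dir
*messages
*list jobs
```

`run` queues the job with the director, `status dir` shows what the director is doing right now, `messages` prints the job report once it finishes, and `list jobs` queries the catalog for past jobs.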
Now I'd like to give you a summary of a few of the features. As I said, the basic philosophy of Bacula is that it's centralized: all the decisions are made by the administrator and centralized in the Bacula director. Everything communicates by the network, so the pieces can be anywhere — it's not important; normally they're all located on a local network. It has its own scheduler, so it can run jobs automatically at night. It can run simultaneous jobs with priorities, it can reschedule jobs if they fail, etc. The restore is interactive — it's always interactive. Typically you'll want to restore from the current backup, the most recent one you did, which may involve many backups: a full backup, a differential, other backups. You can restore to prior dates, or you can restore just a file or a directory or whatever. Once it's installed, most administration is done through the console — very simple. All volumes that Bacula uses have their own Bacula label, so that Bacula will never overwrite data accidentally. Of course, another program could come along, open the tape drive, and write on the tape if it can; I don't have any way to stop that. It supports ANSI and IBM labels, which means that if you're running a big shop, Bacula will coexist with other programs in gigantic autochangers. The format of the volume where the data is stored is Bacula's own format — well documented and extensible. We support Unicode for Win32 and UTF-8, which is the same thing on Unix machines, so there's no problem with virtually any foreign language. There's a Python interpreter built into Bacula, if you configure it, which means a user can get control at strategic points during a backup and carry out various things. And as I said, there's a rescue CD. Backups can obviously span multiple volumes — you couldn't imagine a backup program that doesn't allow writing across multiple volumes. You can put any number of backups on a volume.
They can be backups from multiple clients, from multiple operating systems — Bacula doesn't care. The storage daemon doesn't care; it knows very little about the information that's going onto the tape. Most tape drives are supported, because we have a configurable device driver. Actually, most drives that you'll find today — modern drives — work out of the box, because the default configuration is very simple and works with almost everything. We support autochangers with barcode readers. Bacula has very extensive pool and volume library management; I won't go into the details of that, but it's very important if you're running a large shop. And I've mentioned several times that restores are very rapid. Well, one user told me that he used to use tar, and it took him four to six hours to restore files — spacing down the tape to the right place and then restoring them — and with Bacula, he said, it works in three to four minutes. I think that's pretty good; it's probably a slight exaggeration, but you get the idea. From the very beginning I built in as many security features as I could, because after all you have a daemon sitting there, a service that's listening on a port. The file daemon could potentially transfer all your files to somebody. Well, all the daemons authorize each other, both ways, with a CRAM-MD5 method, which means that they have shared passwords, but the passwords are never transmitted across the network. There is one password that is transmitted across the network, but it's a temporary password, and I don't think there's any security problem there. If you want, both the director and the storage daemon can be run as non-root, as can your catalog database. You can have Bacula create signatures — hash codes — for all your files. Every block that Bacula writes to tape has a checksum on it. And you can restrict user access through consoles, by giving users different passwords and different authorizations for what they can look at.
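Bacula's handshake has its own wire format, but the CRAM-MD5 idea itself can be sketched in a few lines of Python. This is an illustration of the challenge-response principle, not Bacula's actual code: the side being authenticated proves it knows the shared password by returning a keyed hash of a random challenge, so the password itself never crosses the network.

```python
import hmac
import os

def make_challenge():
    # Server side: a random nonce, sent in the clear.
    return os.urandom(16).hex().encode()

def respond(challenge, password):
    # Client side: prove knowledge of the password without sending it.
    return hmac.new(password, challenge, "md5").hexdigest()

def verify(challenge, response, password):
    # Server side: recompute and compare in constant time.
    expected = hmac.new(password, challenge, "md5").hexdigest()
    return hmac.compare_digest(expected, response)

secret = b"director-password"
chal = make_challenge()
assert verify(chal, respond(chal, secret), secret)
assert not verify(chal, respond(chal, b"wrong"), secret)
```

Both daemons run this in each direction, which is why each resource pair in the configuration files has to agree on a password.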
Bacula has built-in communications encryption. If you want to back up across the internet, you can, provided you have good stable communications links. And recently we also added data encryption as part of Bacula. And obviously, if Bacula can do hash codes, we can do a sort of Tripwire-like intrusion detection: if anybody uses the package that was discussed previously to break into your machine, Bacula can potentially tell you that your files were compromised and help you get your files back — sort of the other end of the spectrum. You can see a few of the systems listed up there where Bacula runs, where the client runs. The director and the storage daemon currently run on all Linux varieties, all the zSeries, and on Windows. Some of the other systems I've never tried building them on; I imagine it can be done. It has a data spooling capability, which is very useful: it buffers the data coming in from the clients on disk before sending it to tape, so that the tape doesn't start and stop all the time, and sends it in big blocks, avoiding the shoeshine syndrome. You can back up POSIX access control lists if you want, Mac resource forks, and all of the Windows permissions. From the very beginning Bacula supported large files and 64-bit architectures. It's also been multi-threaded from the beginning, using pthreads. Originally it was written in C, but after a short time — maybe a year — I converted it to C++, though it's a pretty small subset of C++; it helps us generate code that's a little cleaner. Now I'd like to describe what's probably the hardest part of getting Bacula running, and that's the director configuration file. There is a tutorial — an example that runs out of the box — that I recommend to anybody that's starting with Bacula, but I'll give you a rough feeling for how it works.
Each section of the configuration file — and the files for all the daemons look very similar — is what is called a resource. The first example up there is a director resource and the second one is a console resource. The director resource, when it appears in the director's file (there's only one director), means "this is who I am." It tells the director its name, it tells it where it can get scratch file space — temporary file space — and how many concurrent jobs it can run, and it has a password that allows consoles to log in. If you want to have restricted consoles, you can define a console resource, and then, as you see there — I'm not going to go into the details — you list some of the commands. In this case the only thing that user can do is a status, and typically of course you'll have a different password. Now, as I said, jobs are the basic unit that Bacula deals with, and each one has a unique name. It has a type, which tells it what kind of thing to do: a backup, a restore, etc. Then it has a level, which gives more detail about what the job's doing: a full, differential or incremental backup, etc. It has a file set, which tells it what set of files to back up; a file set is a resource, so you could create one resource that's used for all your machines, or you could create multiple resources that differ across the machines. Of course you have to tell it where to get the files — because it deals with a single client for a single job — and where to put them. Then there's also a thing called a pool, which says "use only volumes that are within that pool of volumes," so you can restrict what tapes or what disk it uses. And finally it needs a schedule, if you want it to run automatically. I'll give you examples of a number of these. Here's a job; as I say, it has a name. This job has a dummy name — theoretically it backs up my laptop. It's a backup job.
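A minimal sketch of those two resources might look like this (the names, paths and passwords here are made up for illustration):

```
Director {                     # "this is who I am"
  Name = rufus-dir
  Working Directory = "/var/bacula/working"
  Pid Directory = "/var/run"
  Query File = "/etc/bacula/query.sql"
  Maximum Concurrent Jobs = 4
  Password = "director-password"    # what full consoles must present
}

Console {                      # a restricted console
  Name = status-only
  Password = "console-password"
  CommandACL = status, .status      # the only commands this user may run
}
```

A full, unrestricted console simply authenticates with the Director resource's password instead.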
The client on this one — I have a different name, but I like users to use laptop-fd, because when they send in bug reports I know that it's a file daemon. All the messages from the various daemons look similar, but by suffixing -dir, -fd and -sd we can easily tell which daemon generated the messages. The client, file set, schedule, storage, messages and pool all refer to other resources, which are written much the same way but expand out to hold a lot of different information. For example, here's the client resource. The key thing here is the address of where the client is. This is a domain name, but you can put in an IP address — either IPv4 or IPv6 if you want. Also, since we support multiple catalogs, you have to tell it what the catalog is, and you have to tell it the secret password so that the director can authorize itself to the client. The client will let in only directors that it knows about, and only if they have the right password. The file set is fairly complex. It allows you to include and exclude files or directories using regular expressions and wildcards, and it can be very complex. It allows you to turn on compression and various other features that you can see listed there. I'll give you a real example, which is here — a little bit complicated. The main thing is that, as with all other resources, it has a name; this one's called "Full Set." If you look down where the files are listed, you'll see that Bacula normally does not cross file systems — it restrains itself to a single file system; that's to prevent recursion. You can turn it on if you want. So on this particular system I have three file systems — the root file system, /usr and /var — and I tell it to back up those three. I also tell it to exclude a few directories that I don't want it to ever try to back up: trying to back up /proc or /sys won't work too well. And then the options: you can have multiple options that apply to the files that it's going to back up.
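The file set I just described would look roughly like this (slightly reformatted onto separate lines; the second Options block shows the demonstration exclusions I'll mention in a moment):

```
FileSet {
  Name = "Full Set"
  Include {
    Options {
      signature = SHA1    # store a hash of each file in the catalog
    }
    Options {
      wildfile = "*.c"    # demonstration only: skip C sources
      wildfile = "*.txt"
      exclude = yes
    }
    File = /              # one entry per file system, since Bacula
    File = /usr           # won't cross file system boundaries
    File = /var           # by itself
  }
  Exclude {
    File = /proc
    File = /sys
  }
}
```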
Every time Bacula sees a file, it'll apply those options. In this case it's not particularly useful, but I've told it to exclude all files that are .c files and all files that are .txt files, and that's what it does. Sorry I crammed all the lines together — normally I have them on separate lines, but it doesn't fit very well on a presentation slide. Bacula has a pretty flexible C-like syntax, so that works. This is probably the only schedule that's ever needed. People generate monster schedules — very complicated; there are a lot of things you can do. But this one is what I call a weekly cycle. Maybe it should be called a daily cycle or a monthly cycle, because it does a full backup on the first Sunday of the month. Then it does a differential backup — which backs up all files changed since the last full backup — every Sunday, on the second through the fifth Sunday of the month, assuming there are five Sundays; if there aren't, it doesn't do that one. And then every other day of the week it runs an incremental backup. That's what I use for all my backups. The only thing I do differently is that for each machine I stagger the day on which it does the full backup and the differential backup: one machine I run on Sunday, one machine Monday, one machine Tuesday, et cetera. That helps spread the load over the week, because a full backup will generate a lot of data. Okay, that's the end of the director configuration file. Now I'm going to show you a file daemon configuration file, which looks much the same. And, with one minor exception, this is about all you need — there's very little that you have to change. You give the name of the file daemon, you tell it where it can get working space, and then you give it the name of the director that is allowed to enter that file daemon. You can have multiple director statements, which means you can have multiple directors that access that file daemon if you want, and for each one a password. Very simple.
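That weekly cycle is essentially the schedule shipped in the example bacula-dir.conf; it reads like this (23:05 is the example's run time — pick your own, and stagger it per machine as I described):

```
Schedule {
  Name = "WeeklyCycle"
  Run = Full 1st sun at 23:05              # first Sunday of the month
  Run = Differential 2nd-5th sun at 23:05  # the remaining Sundays
  Run = Incremental mon-sat at 23:05       # every other day of the week
}
```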
The storage daemon likewise has to know who it is, and then it has to know what directors can contact it. Since the storage daemon never contacts a file daemon — they always contact it — it doesn't need to know about the file daemons. So in this case I've defined the name of the storage daemon, and I've defined the name of the one director, with a password, that is allowed to contact it. In the storage daemon you also have to define devices, so it knows where to put the data. In this case I've defined a very simple file device. You see a few of the directives that you have to give it to do that: basically, you provide a path to a directory that already exists where it's going to put the volumes, the volumes will be named with the names that you give them, and away it goes — it's real simple. Something a little more complicated is an autochanger. We've defined the autochanger at the top; the name that you give to that autochanger is the name that will be used inside the director to reference this thing. And I've specified two drives, drive 0 and drive 1, so this is an autochanger with two drives that can read or write tapes simultaneously if you want. Then I've defined one of the devices, as you can see here: Drive-0 is the name, and the archive device is typically the name — in this case the name that Linux uses — to access the drive. Very simple. There would be another similar device resource, with a few changes, so that the storage daemon knows how to reference the second drive.
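The two-drive autochanger just described might be defined in bacula-sd.conf roughly like this (the device paths are examples for a Linux system; mtx-changer is the wrapper script Bacula ships around the mtx program):

```
Autochanger {
  Name = "LTO-changer"        # the name the director uses to reference it
  Device = Drive-0, Drive-1
  Changer Device = /dev/sg0
  Changer Command = "/usr/lib/bacula/mtx-changer %c %o %S %a %d"
}

Device {
  Name = Drive-0
  Drive Index = 0
  Media Type = LTO-2
  Archive Device = /dev/nst0  # the name Linux uses to access the drive
  Autochanger = yes
  Automatic Mount = yes
}

# Drive-1 would be a second, nearly identical Device resource
# with Drive Index = 1 and its own Archive Device.
```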
Okay, that's about all I'm going to tell you about the actual workings and backup levels — not much, but hopefully it gives you an overview, a general idea of the configuration files. I have to say, most developers don't think that backup programs are very sexy; we have a lot of trouble attracting developers. And maybe they're right, but one thing I can tell you: backup programs are extremely complicated. For example, we're dealing with SQL databases; we're dealing with GUIs of all kinds; we're dealing with the web — writing a GUI web interface that puts up information from an SQL database is non-trivial. We deal with networking, and it's critical to this application, because we have to get the data across the network very fast. We deal with the OS at a very low level as well, because we have to basically back up and restore every bit on the file system, and that's not so trivial. And then there are all kinds of algorithms. Let me give you one quick example: the directory tree. When you do a restore, Bacula creates an in-memory directory tree. There may be something like 100 million records that come in, if you backed up for, say, six months with one full backup, a bunch of differentials, and some incrementals. So 100 million records may come in; you need to sort them out and reconstruct the directory tree, which is not so trivial. Well, the original version that I wrote used a linked list to create the directory tree. Just recently I converted it to a red-black binary tree — a non-trivial program — and it sped up the creation of this in-memory directory tree by a factor of 513. So algorithms are very important in this process. And everybody wants to know where we're going; the best thing is to look on the website. Key things: people want to be able to restore files very accurately, which Bacula doesn't always do. If you back up a file, then delete it, and then do a full restore, you'll restore that file — a problem that virtually
all backup programs that are based on time and date have. Unfortunately I don't have the time to go into the details, because I'd like to show you a demo. The main thing to remember is the website, www.bacula.org. You can see a little bit there about how things are developed: we have a Subversion repository on www.bacula.org, and all patches that come in, I personally review. The developers commit themselves, and I usually look at their code, but they're very good about it, so it just works by itself — I'm not a gatekeeper, though. The license is GPL version 2, and one of the things that we did recently was transfer the license, via the fiduciary license agreement, to the Free Software Foundation Europe. So the copyright is now maintained by them, which is a big load off my back, because they maintain all the paperwork, and if there are any violations of our license, their task force will very generously protect us — they have the lawyers and they have the resources needed to make sure it works. And since this isn't a commercial venture, I have no need to have private licenses. Again, the key thing is to look on the website, www.bacula.org — everything is there. I can tell you the one thing you don't want to do, as somebody told me: don't try to print out the manual. It's 800 pages. It's a good resource, and you need to read the manual, but don't try to print it out unless you know what you're doing. This presentation is there as well, if you just go to the right links. Now, that terminates this part, and — let's see, how do I shut this down? — here is where I'm going to do something I should never do: I'm going to try to run a real live demo. I think I have a few minutes, don't I? Can you still hear me? I'm going to start everything from scratch. So I've started up MySQL, and I've started up the Apache server — you don't need to see them, there's nothing really super there. And then I'm going to start up just a copy of
Bacula — it happens to be a development copy, so I hope it doesn't do anything wrong. Being on stage, I can't even type. I'm going to start up Firefox. Normally I use Konqueror, because I run a KDE desktop, but for some reason it doesn't paint things quite correctly, and it works in Firefox. Now, unfortunately, as I think I mentioned in the beginning, this is a little old machine — 700 megahertz, 256 megabytes of memory — so everything runs slowly. Here is a picture, through the web server (I just pointed it directly at the local host here), that shows what went on for a week. I think it was on Thursday morning that I copied this down from my own production site, and you can see the different clients and the different jobs that ran. There were not a lot of them running, because I've been away from home for a month and so I've closed down most of them, but you can see the backups there that have gone on. It shows you when they started and the number of files that changed. For example, Rufus is my development machine: even though I wasn't there, there were 123 files modified that were backed up. And then there are some statistics here. What I'd like to do, if it works, is minimize this, and — you can see here, I'm going to run as non-root — here's the console. It all connected, and I'm going to just run a job that I've sort of trumped up here. It's going to back up one directory on this particular machine — not a lot of files, I think we'll see — and it's going to back it up to disk rather than tape. The backup jobs you saw there back up to LTO-2, but I don't have an LTO-2 here, so I changed all the information. You see it running in real time — I think it takes a minute or two — and then what I hope to show you in the web report is that it actually sees that the job ran, and I'll show you how you can see the job output for that job as well. And then I'll show you, if you're interested, a few of the other features. How much time do I have?
So, as I told you, this is a slow machine. Normally this runs on my desktop — my development machine — in about 10 seconds; here I think it takes a minute, which seems like an eternity when you have 300 people looking at you. Anyway, one thing I can say is that when Bacula gets really active, it can use your CPU a lot. For me it never disrupts my using the system, but I'll tell you, it's something you do want to run at night. There you see the output, and I'll go through this just a little bit more, but in the other one — oh yes, I forgot to skip down the file. So we backed up 81 megabytes here, 5,500 files, in 1 minute 14 seconds. Here it is: this is Timmy, this is a new entry, and as you can see it ran on the 25th at 12:49. If I click here — it's the only one that I can click on, because it's going to go off and talk to the daemon, and the rest of the daemons are not online — this is the job output that I just saw, but it's pulled from the catalog. So for all those jobs, this information is saved in the catalog, and you see the same thing: 81 megabytes backed up, 5,500 files — it's exactly the same output. The output within Bacula you can direct either by email, to a log file, to just a disk file, or to the catalog; you have full control over that, and you can send it to multiple places, to multiple different people, depending on exactly what kind of output it is. If I come back here and I click on clients, this shows a list of all the clients that have been backed up over the period. If I show you a list of the last jobs — the jobs that ran this week — theoretically you can click on any one and get back to the information. Media: I'm going to look at all media, and you see this is where my backup normally goes, onto those LTOs. It's an autochanger — a dual-drive autochanger — and you can see two volumes, LTO-1 and LTO-3, and you can see when they were first written and when they were last written. There's a huge amount of other information that you can pull out if you want. The current one it's working on is LTO-4, which is
still in Append mode, and it has probably by now closed that one out and started up a new one, if I had access to the real data. Finally, down at the bottom, you see a File volume instead of an LTO; that's where I did the current backup. I've done a few test ones, so the volume bytes are much bigger than the 81 or 89 megabytes that we actually backed up. There are a lot more features I could show you, but you get the idea, and hopefully in the near future we'll have a GUI interface that will let you go in, really look at all the parts, and interactively modify the configuration and whatever. So thank you. Schedule? Well, the question was whether there's any information on the schedule for the BAT interface, the new one I'm working on. One of the things I say is that I'm an open source developer, I'm not paid anything, so I refuse to set deadlines. However, it already exists today, and if I can, I'll just launch it here. It doesn't do much of anything yet, but I think in six months it'll be in good shape. So, if we're lucky, there you see a rough idea of what I'm planning, with things such as restore and run a job, which brings up a window much like the one you saw. The idea is that there will be all sorts of resources listed down here that you can click on; for example, if you click here, it's a different way of doing a restore than here, and you can have multiple consoles, which means you can connect to different Directors if you want. So, probably six months. Did you have a question? Yes, the question was, if you use disk storage, can you say use only this much of a disk. Yes, you can define how many volumes you want to create in a pool, and you can define a size for each of those volumes. Normally there's a default that comes with the pool, which means that if you don't fix a size, your disk file is going to grow and grow and grow. But you can fix a size, and then after you've created the disk volume, if you want, you can manually adjust it to
whatever you want, so you have a lot of fine control. Can Bacula create a snapshot before it does a backup, is that your question? On Linux systems, no, because no one's really asked for it. On Windows systems, yes: in the current version the default is to do a Volume Shadow Copy, providing the system supports it, which is a snapshot taken before any changes get put into the system, and Bacula backs up from that. Bacula has a mechanism by which you can run a script file before a job and a script file after a job, and in those script files you can do anything you want. In fact, we're working on it so that, if you really want, you can run a script that modifies which files Bacula will back up, so that in that sense users can control their own file lists. I don't particularly like that idea, but we're going to try to provide all those main features. There's a question here? Yes, Linux has things such as LVM that really accomplish the same thing. The question is what kind of compression we are using. Normally Bacula doesn't do compression; we rely on the hardware to do it, and tape drives have very good compression. However, for disk files, Bacula can do compression on the client, and it's gzip; in fact you can set the level between 1 and 9, but that's the only algorithm we have currently. Question? No, the question was whether the Bacula authorization uses a standard method, and the answer is it uses standard algorithms, but it has its own communications protocol. First, it's a block protocol, not a stream protocol; it sends blocks of data. And second, it's its own code, so if you want to authenticate with a Bacula daemon, if you want to communicate with it, you need to write some special code, but people have done it already, for example from Perl, and it's very easy to do. The question is, have I considered Kerberos?
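Several of the points from the last few answers, limiting disk volumes in a pool, running scripts before and after a job, and client-side gzip compression, are plain directives in the Director configuration. This is only a rough sketch; the resource names, paths, and sizes are made up:

```
Pool {
  Name = FilePool
  Pool Type = Backup
  Label Format = "FileVol-"      # disk volumes are created and labeled automatically
  Maximum Volumes = 10           # never create more than 10 volumes in this pool
  Maximum Volume Bytes = 5G      # each volume stops growing at 5 GB
}

FileSet {
  Name = "Compressed Home"
  Include {
    Options {
      compression = GZIP6        # client-side gzip, level 6 of 1-9
      signature = MD5
    }
    File = /home
  }
}

Job {
  Name = "HomeBackup"
  # ... the usual Client, FileSet, Schedule, Storage, Pool, Messages ...
  ClientRunBeforeJob = "/usr/local/bin/pre-backup.sh"   # e.g. take an LVM snapshot
  ClientRunAfterJob  = "/usr/local/bin/post-backup.sh"  # e.g. release the snapshot
}
```

Without Maximum Volume Bytes a disk volume simply keeps growing, which is the grow-and-grow behavior mentioned above.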
No. If somebody wants to implement it and send us a patch, why not? I put in a simple authorization method that works pretty well, and no one is complaining about it. One thing is, if the authorization fails, Bacula waits 5 seconds before allowing another attempt, so if you try blindly beating away at it, it will take you a long time before you get in. How does Bacula handle open files? On Windows, it uses Volume Shadow Copy, which takes care of the problem, sort of like a journaling file system, once the snapshot's been taken. On Linux we don't do anything; as was mentioned, you can use LVM, and in most cases, once you open a file in Linux, yes, it can grow, somebody can shrink it, but typically it doesn't get pulled out from under you like it could on other systems, and so the backups are pretty good. I'll tell you, if you ask it to, Bacula can tell you whether a file has changed during the backup process, though that's not actually in a release version yet. And when it restores a file, it always checks that it restored the same amount that it found at the beginning of the backup, so at least you'll know about it. How does Bacula handle scheduled jobs where the job fails because the client machine is down? There are directives within Bacula that tell it to retry the job x times, where you define x, after a certain interval that you define. There are still other features we want to add in the future. For example, if Bacula gets stalled for one reason or another, say an operator doesn't change a tape, a bunch of jobs sit there waiting for the tape drive, and then 48 hours later there's a new set of jobs waiting for the tape drive, and they all run. At some point we need to make Bacula smarter: if there's a full backup running and then an incremental backup comes along, probably we should terminate the incremental backup, but that will always be a user option. Sorry? If you lose your SQL database, is it easy to recover from? Oh, it's not very
pleasant, because there's a lot of information in there. The most important information is what jobs are on what volumes and what volumes you have, all the historical information. If you were careful, you wrote what I call a bootstrap file when you did a backup. If you use the standard Bacula out of the box, every night it will come up, do a backup, and then back up the catalog, and in doing so it will create two files for you. One is a bootstrap file for the backup, which is sort of a condensed ASCII file that you can feed to the restore process to restore all the files that were backed up. The other is the same thing for the backup of the catalog. So if you have that bootstrap file, you can go and get your catalog back very quickly. There's also another method, which is really painful, though a lot of users use it: there's a program called bscan, where you point it at a set of tapes, or a subset of tapes if you want, and Bacula will read through those tapes and reconstruct a database. It assumes that you've created a dummy database at the beginning. You should back up your database every night along with your data, and you should also back up your configuration files, because they're not always so easy to reconstruct. Okay, Bacula version 2.0 and above, and actually we're up to 2.2 now, can do what we call migration, which will, in some cases, read the data on one volume and transfer it to another volume, and so you can do tape-to-tape migration that way. The question was about different retention periods. Okay, well, the current migration deletes the information about the original job; the second job takes on the characteristics of the original, so it looks like it ran at the time the original job ran, et cetera, but it's stored on a new volume, and so it has all the retention periods and times from the volume you put it on, not the original one. So it follows logically. Talk a little louder? The question was, can
Bacula reconstruct a full backup, or the current state of the system, from a full backup and then incremental backups? He's asking if you can do a full backup once and then thereafter do only incremental backups. The problem with that kind of scheme is, if you did your full backup a year ago, how many incremental backups do you have? You may have hundreds and hundreds of tapes that Bacula has to read through. Okay, now I understand what you're asking: can you take that old full backup plus a bunch of incremental backups, put them together, and create what's called, if I'm not mistaken, a virtual full backup, or something like a synthetic full backup. Theoretically it's possible. I haven't implemented it yet, but now that we have migration, it's a rather trivial enhancement that we'll be making. So the answer is no, you can't do that today, but within six months, yes, Bacula will be able to take a set of tapes, automatically, though you can do it manually if you want, that represent the current state of a client, let's say, and generate one new backup that has only the current files in it, as if it were a full backup made today. At first thought you might say, why would you want to do that? Well, there are a lot of shops where uptime on a particular machine is critical; they don't want to have a backup process running for the time it takes to do a full backup. So they'd be very happy to condense the backups into one single full backup offline, on another machine. He's asking whether that will have the same problem with deleted files when you do a restore, because Bacula backs up files based on their time and date, and any file that you deleted before the current time will come back if you do a full restore. That is project number one on our list of things to fix. It's a bit compute intensive and it takes some time: you actually have to go access the catalog every time you're
going to back up a file, or you have to send the full data from the catalog so that the client knows what files are backed up and then it can choose. That will be coming, and yes, it will also solve that problem. I think we have time for one more. What's on the Bacula rescue CD? The rescue CD is a little different from most rescue CDs, and I hope to work with some of the Linux distro suppliers to change that. The problem with current rescue CDs is this: your disk goes down, your partition is dead, your disk is dead, and you replace it with a new one of identical size. Well, if you're like me and you have five or six different machines, or if you're in a big shop with 2,000 of them and half of them are partitioned differently, how do you get the partitions back? With regular rescue disks you get no help: you're in a minimal shell, you don't know what to do, you don't know anything, and typically they don't even mount the disk for you. Well, when you create the Bacula rescue CD, it takes a snapshot of your system. It tries to use your own kernel, so that you're familiar with it, and then it takes a snapshot of all the critical data on your disk and saves that onto the CD. So when you boot from that CD, providing you have the instructions already printed out, without a whole lot of problems you execute a few scripts and you can have it totally repartition your disk, reformat any other partitions, and whatever; it's an aid to getting you back. The process of recovering a system from bare metal is non-trivial. I mean, if you had your Director, your catalog, and your Storage daemon on the system that went down, it's not a simple process to get all those things back so that Bacula can restore. You're much better off having a second system with a temporary Director, if you want, and a Storage daemon; you fire that up, you put only a client on the machine that's broken, and you use that to restore all your files. That's a pretty simple setup. I'll be
around afterwards, so I can answer more questions if you have any, but we've run out of time. Thank you very much.