I will talk a bit about Shaft, which is not only a movie (a couple of movies, in fact) but also a tool we made at the company where I work, for a couple of projects. In the systems area of our company we were writing a lot of shell scripts, and we even made some important, somewhat mission-critical things which were mostly shell code in GNU Bash. We reached the point where it was becoming very difficult to test whether everything was working as expected, so we thought about making something which could help us check that everything went right. Shaft, apart from being a funny name, is an acronym for SHell Advanced Functional Test. More than functional, it is declarative. To sum up what it is: it is for shell scripting; it tries to be simple, although the implementation is a bit hairy; and it helps a lot when writing test cases. It is somewhat declarative — it does not fully achieve declarative unit testing, but at least it saves some time. A lot of time, in reality. It is for unit testing, as I said before, and it is a bit strange to mention shell scripting and unit testing in the same sentence. I also want to highlight what it is not, or which not-so-good parts it has. For example, it only works with Bash at the moment. I suppose it would be possible to port it to ksh, but it is not very straightforward. It is somewhat simple, so if we wanted to do very complex things with it, we would probably shoot ourselves in the foot. And you still have to write the tests: it is much easier, but not totally avoidable, which is logical after all. And it is not the solution for all our problems related to shell scripting. So those are some of the motivations for starting this small, tiny project, which is really a side project of other, bigger things, and which we have been reusing from one project to another.
The main reason we did this is that, in general, people do not want or like to write documentation, because it takes a lot of time, and something similar goes for unit testing: it is something you want to do, and sometimes have to do, because it is good, but it takes a lot of time. So one of the motivations was making it easier, so it could take less time, and more convenient, because we had found a couple of ways of doing testing in Bash, but they were somewhat cumbersome. So we went down into the code and decided to do something about it. And why could this be relevant or interesting for someone working on a Linux distribution like Debian or others (there are a lot of them)? Well, if we take a closer look at the packages and the things they include, we find something like this. In fact, these are real numbers taken from the latest stable release, Debian Lenny. There are some parts of the distribution which are rather important and should not fail. For example, the shell code in the initial RAM disk, used both for live CD-ROMs and for installations, for initial setup and for booting the system, contains about 1,200 lines of shell code. The init scripts, the ones which go in /etc/init.d, are about 3,000 lines of code. But it starts to become relevant when you reach the package scripts — I mean the scripts which are run before installing a package or once the package has hit the file system, for instance the postinst scripts. Taking into account a typical installation with no X Window System and about 200 installed base packages, that is about 17,300 lines of code. Not all of this code is relevant, because most scripts in this set of 200 packages are only one- or two-liners, or at most five or six lines, so there would be no need to test those trivial scripts. But there are others which are quite complex.
For example, the ones which update GRUB, the boot loader configuration, or the kernel packages contain some longer scripts which could affect the system if the package is not installed correctly or if the script has some problem in its code. And if we take a look at the rest of the file system — something nobody should try in general, because it takes a lot of time identifying which files are scripts and counting lines — it is 90,000 lines of code. Of course, not all of it is interesting for testing, but maybe there is something in Debian which could be tested. And the point is that, as far as I know — I asked around and nobody could tell me otherwise — there is no automatic testing of the distribution scripts. That would probably not be interesting once the code is done, because in theory you already have something which works, but it could be interesting for avoiding regressions when making new releases. So for important things it could be an option. For example, I tried to find out yesterday whether the Debian installer was being tested or not — that is why the exclamation mark is there — and at last I was able to download the source code: there is no automatic testing in the Debian installer either. Now, there is a set of existing tools for testing shell scripts in Bash. Interestingly enough, the Advanced Bash-Scripting Guide recommends using echo or printf for doing testing, which is tricky and not very good, as you probably know. Then there is set -x, which prints a trace of every command and shell function that is executed while the script is being run.
But if you have a complex script, or a long script which runs for a while, you end up with a very long trace log of what your script is doing, and it can be somewhat difficult to follow how things are being executed. So in practice set -x is only useful for small things — or at least that is what we thought when we made this tool. There are two more not very well known Bash flags, set -u and set -v. The first one complains about undefined variables, which sounds great in theory, but in practice is not that great. For example, if you use Bash arrays and you have an empty array and you try to determine how many items are in the array, you can get a complaint about an undefined variable — but it is empty, not really undefined. So this option can be more confusing than helpful. And set -v prints every input line as it is read, which is somewhat interesting, but its application is limited. Then we found two things which could be useful — and in fact they are somewhat useful — namely the debugger for GNU Bash, bashdb, which is a bit tricky to use but gets the job done, and shunit2, a small xUnit-like library which you can source in your scripts to do unit testing. At first we tried to use shunit2 before rolling our own tool, but as we had a lot of test cases which involved file system operations, we ended up with very long test cases, especially the parts for setting up the environment and the like. So it was a step forward, but it was not the perfect solution at all. I would consider the first four options a failure: they are useful, but they have a very limited range of application. As for the other two tools, well, they could be useful.
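The flags mentioned above are easy to try from the command line. Here is a minimal illustration (not part of Shaft) of what `set -x` tracing looks like and of how `set -u` turns a never-assigned variable into a fatal error:

```shell
#!/bin/bash
# set -x prints each command, prefixed with '+', to stderr before running it.
bash -c 'set -x; msg=hello; echo "$msg"' 2>&1 | grep '+'

# set -u makes referencing a variable that was never assigned a fatal error,
# so the inner shell exits with a non-zero status.
bash -c 'set -u; echo "$never_assigned"' 2>/dev/null
echo "exit status under set -u: $?"
```

The first command shows the trace lines `+ msg=hello` and `+ echo hello`; the second demonstrates that under `set -u` the script aborts instead of silently expanding the unset variable to an empty string.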
They are useful, in fact, but not as easy to use as one would expect. The good thing about shunit2, now that I recall, is that it works with any decent shell out there — bash, sh, and ksh, mostly. So it is in fact quite portable, and if you want your scripts to be tested and used with different shells, I would go for it, because Shaft works only with GNU Bash. You may be curious about how testing is done using our tool, so I will give a quick example of how the thing looks. Well, it is laid out this way. We decided to build upon shunit2 because we were already using it in some places and we did not want to throw it away and start again. So it started as a side thing, which we were trying to avoid, but we made the most out of it and ended up building on it. And the Bash debugger — we use it because we wanted to be able to stop execution at certain places and modify the environment before making particular tests. Having the Bash debugger under our code allows, for example, the following: if you have a script which tries to download something over the network, say using wget, you can skip that particular instruction or some lines, and then place the supposedly downloaded file in the file system artificially, so you can run the test suite automatically without requiring network access, which is a good thing. We have used it for this kind of corner case, in which you need to do something automatically which depends on external things that are not part of your code. So that is also why we use it. And how does this affect the code we may already have in a script, and how do we write test cases? Well, to be able to stop at certain places and make tests, we decided to go simple and put some marks, as comments, in the code. This is far from perfect, because in theory unit testing says that you should not need to know what is inside the code to test it, so we are not really doing perfect unit testing here.
But it was very convenient for us this way, because we had a lot of pre-existing scripts which were not structured at all. They had no functions; you could not source them, because they had side effects. So it was not possible to import them into another running script and test individual parts of the code. This way we can add the marks and test some parts without modifying the original code much. It was a necessity at first, but it turned out to be convenient, in fact. Then one writes some profiles, which are a mix of bashdb command-script commands and a set of commands implemented in Shaft, which provide additional facilities to make testing easier — the final goal of all of this. And then one writes a small wrapper script which imports Shaft and starts running the tests. This may sound like a stupid idea, because it could just be a command line option to specify which source file you want to test and the like, but as we will see later, it allows adding common environment things for all of the test cases. About the marks: here is a very simple script which I made yesterday as an example for today. Imagine that we want to test whether we are entering the if clause: we can stop here, where test02 is written with a pound mark and a plus. Or, for example, test whether Bash actually makes assignments in the expected way, and stop after the assignment H=45. So scripts need very little modification to be tested with Shaft. The bootstrap script which runs the tests is also very simple: it is only a matter of defining what we are testing, adding a relative path to the directory which contains the test cases (one profile per test case), and sourcing the library. And we could add here shell functions which would be inherited by all the test cases.
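As a rough sketch of what such a marked script might look like — the exact mark syntax below, a comment starting with a pound sign and a plus, is my reconstruction from the description, so adapt it to what Shaft actually expects — note that the marks are ordinary comments, so the script still runs unmodified outside the test suite:

```shell
#!/bin/bash
# A trivial script instrumented with hypothetical Shaft stop marks.

H=45    #+test01   hypothetical mark: stop right after this assignment

if [ "$H" -gt 40 ]; then
    #+test02       hypothetical mark: stop here to verify we entered the branch
    echo "H is large: $H"
fi
```

Because the marks cost nothing at run time, the instrumented script behaves exactly like the original when executed normally.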
This is interesting because a hook function can be defined in each case for preparing the environment, and that function can use the things defined in the runner script. For example, I told you before that we were using this tool for testing scripts which had a lot of side effects on the file system and needed a starting environment there. So we defined a common function for preparing this layout in the runner script, and then we used it from all the test cases. Each test case then contained only the statements actually needed for testing, while the preparation of the file system was isolated in this other file and shared between the cases. No code was duplicated, which saved a lot of time. And this is how a test case actually looks. Well, it is a strange mix of bashdb commands, shell functions, and our own commands — for example gts, a command defined in Shaft. There are two hooks which can be defined here, apart from other functions, which are ignored. So you can define, for example, a function called prepare_foo and call it from a hook called prepare; I have not put a prepare hook here, for simplicity. The breaks hook should define, in logical lines of code, where you want to stop. The mark used for stopping in this test is the name of the test, test01 — if you recall, I had a mark over there in the code, so it will stop just after the command which carries the test01 mark. I should say that there can be blank lines around the mark, because the numbers which define the place to stop are logical line numbers: blank lines and comments are ignored, which is pretty convenient, especially if you are making cosmetic changes like adding white space and the like, which sometimes happens. So what would this one do? Well: set a breakpoint at the line after the test01 mark, then start running the thing. There is a reason why there is no implicit start.
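To make the shared-preparation idea concrete, here is a self-contained sketch in plain Bash. All the names (`prepare_layout`, `testcase_01`) are hypothetical, and the test case is inlined rather than living in its own profile file, so that the sketch runs on its own; a real Shaft runner would source the Shaft library instead:

```shell
#!/bin/bash
# Hypothetical runner script: a common helper defined once, reused by
# every test case instead of duplicating the file system setup.
prepare_layout() {
    mkdir -p "$1/etc" "$1/var/log"
    echo "id=example" > "$1/etc/app.conf"
}

# A test case: it contains only the statements needed for the actual
# check; the environment preparation is delegated to the shared helper.
testcase_01() {
    local dir
    dir=$(mktemp -d)       # private scratch directory for this case
    prepare_layout "$dir"  # reuse the common setup
    grep -q 'id=example' "$dir/etc/app.conf" && echo "test01 PASS"
    rm -rf "$dir"
}

testcase_01
```

The point of the design is visible even in this toy version: the test body shrinks to the assertions themselves, and the setup lives in exactly one place.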
That is because there are commands for defining environment variables, a message to be printed in case of test failure, and some other similar things. I think it is better this way, because if there were an implicit start, one would have to restart the execution after defining that kind of thing. It is a matter of taste, probably. And here, well, the syntax is more or less fixed, and it is this way because the parsing of all of this is done in Bash, and it is not very easy to write a parser in Bash, as you probably know. What the last line really says is that we are checking that something is greater than zero, and what we are checking is the rest of the line: if the rest of the line prints something greater than zero, the test passes. It is a bit funky, but it is understandable. Here is another example. If you recall, I had another mark for stopping inside the if block, which was test02, and we can print a dummy string there, because if the execution does not go inside the if block, that statement will never be executed. This way we are checking whether we have entered the if statement. If the code inside the if were not run, the check would still be performed, but the output would be empty, so the test case would fail. It is a particular way of testing whether a code path does what it is supposed to do. It may seem a bit strange at first, but once you have your marks set, it is only a matter of one line per point where you want to know whether the execution passed. So it is pretty convenient after all. And this is a small sample of how it would look if we wanted to do the two tests in the same profile: it is more or less the same, but we continue executing until the next stop is reached, so it is pretty straightforward.
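The semantics of that check — "the test passes if the rest of the line prints something greater than zero" — can be expressed in plain Bash like this. It is a sketch of the underlying idea, not Shaft's actual implementation:

```shell
#!/bin/bash
# The command under test: here we just count matching lines, as a stand-in
# for whatever expression a profile would put after the check keyword.
count=$(printf 'foo\nbar\nfoo\n' | grep -c '^foo')

# The check itself: pass when the captured output is greater than zero.
if [ "$count" -gt 0 ]; then
    echo "PASS: got $count"
else
    echo "FAIL: expected a value greater than zero, got $count"
fi
```

This also shows why the empty-output case fails: if the code path is never entered, nothing is printed, the comparison does not hold, and the test case is reported as failed.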
And if we run this — the example runner is a script which starts the whole thing, as I said before — it looks something like this. If some test fails, it is reflected in the results at the end. One thing which is not in the slides — let me try to put a big shell here, okay? Well, this is very big. For example, you get command line parsing for common stuff for free: running only a single test if there is more than one, being verbose, and so on. Another interesting thing is that if you turn the -d switch on, it creates a log file, which you could, for example, ask someone to send you by email. That may be useful if you are delivering your shell scripts to some user and the user says, "hey, your shell script does not work": you can answer, "well, run the test suite and send the results to me", so you can tell whether it is their fault or the fault of your code. So the logging mode is interesting; you have the logs. And, for example, to show that it is convenient to define the prepare hook, as I said before, I can edit this one, add a prepare function here, and print something to the console. If we run the thing again now, something appears on the console. I also said that I could define some particular function in the runner script and reuse it this way. This is where we would prepare the whole file system layout, for example, in the case where we are doing things with the file system. I can say foo here and reuse the function from all the test cases. So it is pretty convenient for testing things which have side effects and whose preparation may not be as trivial as it seems. And there is another feature which is not on the slides, regarding the file system: the tests are run in a somewhat sandboxed environment, in a temporary directory under /tmp.
So it tries to isolate the testing from the rest of the file system, so you will not be screwing things up in your system. Also, for each test case a different directory is created — I suppose I should have the directories there. Whoops. In /tmp there are directories named after the Shaft example script and the like; well, here they are, the six of them. So you get a sandboxed environment per test case, and each test case runs in its own environment, inheriting the environment from the runner script. It is actually quite difficult — I would say even impossible — for one test case to affect the other test cases, which is good. Also, the changes made to the file system by one test cannot affect the next one. As a quick summary, because there is not much more to say about this: the implementation is somewhat hackish, and that is something I would like to change, but it is not very easy. There is a lot of juggling with file descriptors, changing their numbers and moving them from one place to another, because all of this works with pipes and with filters. So it is maybe not the easiest thing to understand — although if you take a look at the debugger's code, it is also not easy to understand, so I do not know whether being somewhat complex is just a common trait of debugging and testing aid tools; this is the first time I write one of them. But I think it is somewhat useful. Even being hackish, it saves a lot of time. For example, we were running out of time in one of our projects — I work in the systems area, and we had an internal project for deploying a set of things — and some scripts were failing horribly. We were doing some manual testing; we ended up having good tests, and the thing worked after all.
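The per-test sandboxing described above can be approximated in plain Bash with `mktemp -d` and explicit cleanup. This is a generic sketch of the technique, not Shaft's code, and the directory name pattern is made up for illustration:

```shell
#!/bin/bash
# Give each test case its own scratch directory under /tmp and remove it
# afterwards, so one case cannot leak state into the next.
run_in_sandbox() {
    local name=$1; shift
    local sandbox
    sandbox=$(mktemp -d "/tmp/shaft-example-${name}.XXXXXX")
    (
        cd "$sandbox" || exit 1
        "$@"                  # run the test body inside the sandbox
    )
    local status=$?
    rm -rf "$sandbox"         # wipe the sandbox whatever the result was
    return $status
}

run_in_sandbox test01 bash -c 'echo scratch > file.txt && grep -q scratch file.txt'
echo "test01 exit status: $?"
```

Running the body in a subshell means its `cd` and environment changes die with it, which is the same isolation property the talk attributes to Shaft's per-case directories.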
This is another thing I am unsure about: whether it would have a lot of application outside of some particular settings, because it is not very usual to write mission-critical code in shell scripts — but such code exists. So I do not know whether it would be applicable to a lot of cases, but at least for some of them it may be useful. And the most important thing, after all, is that the name of the tool is catchy. That is all, in general. We have published the code, so you can go there and take a look at how catchy it is, and if you want to ask something later, or for anyone who is not here, we have a mailing list. Questions? If there are some. No questions? Well, as there are no questions, a final remark before leaving, because I have to go back home today: I left some t-shirts and some notebooks at the front desk, so help yourselves and pick one if you want. And thanks for attending.