Okay, alright. So, good morning. We are glad that there are some people here this morning after the cheese and wine party. Anyway, this is Nicolas, I'm Ralf, and we are going to talk, in stereo in fact, about shell scripts: shell scripts in general and Debian maintainer scripts in particular. So, let me first start by reminding you what this is. Maintainer scripts in Debian, most of you will probably know this very well. This is a citation from our Debian Policy, and we will have quite some citations from the Policy. So, a .deb package contains two sets of files. The first is a set of files to be installed on the system when the package is installed, which we call the static contents of the package. And then you have a set of files that provide additional metadata about the package, or which are executed when the package is installed or removed. Among those files are the package maintainer scripts. These are the scripts we are interested in and that we are talking about today. You may have up to four of them in any single binary package: the preinst, postinst, prerm and postrm. Their function is, very roughly: when you install a package, first you execute the preinst script, if it is present in the package; then you unpack the static contents, the tarball; and then you execute the postinst script. When you remove a package, it's the reverse, with the rm scripts: first you execute the prerm script, then all the static contents of the package are removed from your file system, and finally you execute the postrm script. The situation is a little bit more complicated when you upgrade or downgrade a package, because then there is a mixture of these scripts from the old and the new version of the package which are executed, with additional arguments passed to these scripts in order to indicate to them what precisely is happening.
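To make that lifecycle concrete, here is a hedged sketch of what a hand-written postinst typically looks like. The function name postinst_main and the messages are made up for illustration, but the arguments (configure, abort-upgrade and friends, plus the old version in $2 on upgrades) are the ones dpkg passes according to Policy:

```shell
#!/bin/sh
set -e

# Hypothetical postinst body, written as a function so it is easy to
# exercise. dpkg passes the action as $1 and, on upgrades, the most
# recently configured version as $2.
postinst_main() {
    case "$1" in
        configure)
            echo "configuring (previous version: ${2:-none})"
            ;;
        abort-upgrade|abort-remove|abort-deconfigure)
            echo "rolling back after a failed $1"
            ;;
        *)
            echo "postinst called with unknown argument: $1" >&2
            return 1
            ;;
    esac
}

postinst_main configure 1.2-3   # prints: configuring (previous version: 1.2-3)
```

The same case-on-$1 shape is what debhelper generates automatically; the hand-written parts discussed in this talk are inserted into exactly this kind of skeleton.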
So, these are the things we would like to analyze formally, to know whether they are correct. Before looking more into what this means, a breakdown of what we currently have as maintainer scripts. These are more or less recent numbers, from May. We had at that time about 50,000 binary packages in sid, and counting all the maintainer scripts we arrived at about 30,000, which means that not every package has them. Of these 30,000, well, you probably know that many of, or large parts of, these maintainer scripts are today automatically generated by the debhelper tools. But anyway, we find among these 30,000 about one third which contain at least a part of the script which has been written by hand, so which has not been inserted by some of the debhelper tools. And by the way, we did some similar, more basic analysis some 10 years ago in the Mancoosi project, and we obtained similar figures about the portion of maintainer scripts which are partially written by hand; so apparently this doesn't move too much over time. So, about 10,000 scripts are at least partially written by hand, which means they potentially contain bugs and errors. And this is something we should worry about, because these maintainer scripts are executed as root when you install a package on any machine. So we should be defensive in writing these scripts, and we should be sure that they are correct. A breakdown by language: almost all of them are POSIX shell, with a few exceptions. There are about 230 bash scripts, with a /bin/bash shebang, and 16 dash scripts. There are five ASCII files; these are the findings according to the file utility, and ASCII files means these are shell scripts which do not have a shebang. And we will come back to this case a little bit later.
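A minimal sketch of the shebang check involved (our own illustration; the talk's real numbers came from running the file utility over the unpacked corpus, and the helper name has_shebang is made up):

```shell
#!/bin/sh
# A script "has a shebang" if and only if its first two bytes are "#!".
has_shebang() {
    [ "$(head -c 2 "$1")" = "#!" ]
}

# Demonstrate on two throwaway files in a fresh temporary directory.
cd "$(mktemp -d)"
printf '#!/bin/sh\necho hello\n' > with-shebang
printf 'echo hello\n'            > without-shebang

for s in with-shebang without-shebang; do
    if has_shebang "$s"; then
        echo "$s: ok"               # prints: with-shebang: ok
    else
        echo "$s: missing shebang"  # prints: without-shebang: missing shebang
    fi
done
```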
And there are even two ELF executables, and these are in fact for the packages of bash and of dash, and this is quite normal, because obviously you cannot write their preinst scripts as shell scripts: you don't have an interpreter yet when you install the stuff. Okay. So, what the Policy says about maintainer scripts is, well, they are not required to be shell scripts, as we have already seen on the previous slide. csh and tcsh are discouraged, for various reasons; there are several papers and documents available on the web which explain to you why it is bad to program in any of these languages. The Policy says that they should start with a shebang indicating the type of the language which is used in the script. They should use set -e. set -e puts the shell into a mode which we call, in Debian, the strict mode, which is not POSIX terminology. The strict mode means that when there is a command which fails, then the shell script itself will fail and will not continue execution. This is what you usually would expect from a maintainer script, because if something unexpected happens you want to fail noisily, so you see that something went wrong. However, one has to know that set -e, the strict mode, is temporarily disabled in certain situations during the execution of a shell script, and we will come back to this too. About the POSIX standard: for a long time, in fact when we started our project, the version of the POSIX standard which was mandated by Policy was an old version, and quite recently this was updated, so the Policy now talks about the 2017 version of the POSIX standard, which is great, thanks a lot for that. Unfortunately, POSIX recently updated to a 2018 version, so we will probably have to try to update this again. The Policy also says that any shell interpreter should implement the POSIX standard with some additions, with some Debian-specific embellishments, and this concerns the echo built-in.
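The caveat about set -e being temporarily disabled can be seen in a tiny sketch (our own example, run through sh -c so the aborted inner script does not take the outer one down with it):

```shell
#!/bin/sh
# "Strict mode": set -e aborts the script on a failing command, but it
# is suspended while a command is being used as a test, e.g. as an
# 'if' condition or on the left-hand side of || and &&.
out=$(sh -c '
    set -e
    if false; then echo "never"; fi  # failure tolerated: if condition
    false || echo "handled"          # failure tolerated: left of ||
    echo "still running"
    false                            # bare failure: script aborts here
    echo "never printed"
')
echo "$out"   # prints: handled / still running (two lines)
```

This is exactly why "|| true" silences a failing command under set -e, a pattern that, as discussed later, shows up very often in the corpus.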
If it is implemented, it should support the -n option. test, when it is a built-in, should support the -a and -o options; we will also come back to this. The shell should support local scopes and the local keyword, and we will talk about this too; and there is also some stuff we don't care about in the context of our project, because in our project we ignore concurrency and signals. Okay, so we will look at POSIX shells with these Debian-specific extensions. Now, what we are trying to do in this project, globally, is to get a formal assertion of the correctness of these maintainer scripts, that is, an assertion that they behave correctly on a semantic level. In fact, to those of you who were in Cape Town two years ago at DebConf: I gave a talk at that occasion where I presented the project, and I gave an example of one of my own maintainer scripts which was wrong and did terrible stuff, because it removed too many files when you removed the package. Okay, so these are the kinds of bugs we would like to find at the end of our project. I tell you right away: we are not there yet. We took a lot of time working on the front end of our tool chain, and this is in fact what we are going to talk about today. So this is about formal analysis. Formal analysis in the sense of program verification, formal program verification; this is not testing. We would like to get, by formal analysis, really an assertion of the fact that our scripts behave correctly in any possible legal situation. So it's a much stronger assertion than only testing.
So, possible outcomes: we would hopefully get an assertion of correctness, but there one should also be aware that all one can hope for is an assertion of correctness in some abstraction, some model of the system, since what you have on a real Unix system is a quite complex thing, and you probably won't be able to model all of it completely in something which can be handled by program verification. Another possible outcome, and this is probably something which will be much more useful for Debian, is finding possible bugs in packages, and we have already found quite a few of them for the moment, on a quite trivial syntactic level. Okay, and this is what we talk about today. The first step to obtain a tool chain which does all of this is to start with parsing of shell scripts, to obtain a syntax tree for shell scripts, and this is what we did first. Thank you. So that was already the topic of a talk that Ralf and Yann, another colleague, gave at MiniDebConf Hamburg, about why parsing POSIX shell is hard and how we did it in our project. The problem with POSIX shell is that it's not designed to allow parsing of whole files. It's designed so that you first parse one complete command, then execute it, and then you go back to parsing, which means that when you want to parse a whole script you can get into trouble. Also, in order to have nice features like being able to write 'mkdir for' or things like that, you have a parser that must be speculative, and maybe promote words to keywords, and try things, and if that doesn't work try something else, which is also hard to write. Yes. And one specificity of POSIX shell is also these here-documents, and the difficulty about parsing here-documents is that they are not local things.
So first you parse a command, and while parsing that command you discover that there will be a certain number of here-documents that you will have to parse afterwards, and so you have to keep information about what you will need to do after parsing your command. And in fact, in general, statically parsing one full script is undecidable, because you can have aliases, maybe in branching conditions, which makes it hard, even on a syntactic level, to parse the full file. So we have a parser that's called Morbig. It's available on GitHub if you want. It's written in the language OCaml, and it uses a parser generator, Menhir. This is something we are proud of, the fact that we use a parser generator: the usual parsers for dash and bash are in fact handwritten pieces of C code working on a character basis, while in our case we were able to use the grammar that is in the POSIX specification. The Menhir parser generator is a great tool that allows us not only to write the grammar, but also to write the exceptions in the grammar that are required in order to have this speculative parsing that shell requires. It allows that by giving you a way to introspect the state of the parser and to maybe update it yourself if you want. That is, you can look at the parser and say: oh, it's starting to parse, I don't know, a new command, so this is a place where, if you read 'for', this might be a keyword. When you are somewhere else you say: I have already started parsing my command, so when I read 'for' I will keep it as a simple word, and work that way. So this is something that is provided by the Menhir parser generator. This allows us to have mostly really high-level code that you can easily relate to the standard. You take the grammar that is written in the standard, you take the grammar that is written in our tool, and you can see that they do the same thing.
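Both difficulties show up in small, perfectly legal snippets. The following sketch (our own illustrative examples, not taken from any maintainer script) exercises 'for' as both keyword and ordinary word, and two here-documents whose bodies only start after the complete command line has been read:

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"

# 'for' is a reserved word in command position but an ordinary word
# elsewhere, so the parser has to decide speculatively which one it
# is currently looking at:
mkdir for && rmdir for
loop=$(for w in for; do echo "word: $w"; done)

# Here-documents are non-local: both bodies are parsed only after the
# whole command line, with both redirections, has been read.
{ cat <<ONE; cat <<TWO; } > bodies
first body
ONE
second body
TWO

echo "$loop"   # prints: word: for
cat bodies     # prints: first body / second body
```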
And if you want to know more about all the things that happen inside this tool, there is the MiniDebConf talk that has been recorded. What Morbig produces is what we call a concrete syntax tree. This is a huge tree, in our case with more than 50 recursive type definitions. What this tree represents is all the grammar rules that have been used to parse the script. I don't know if that's readable on the slides, but you have one type per grammar rule, so a complete command, a list and so on, and then one constructor per rule of the grammar that was applied in that case. So this corresponds directly to the grammar of the POSIX standard, once again, and so you can work on really what has been used. In our case we want to crawl through these trees, to traverse them, to look at certain places, like what happens in redirections, in function definitions and so on. Since it's a huge tree, we would have to write functions for all the cases of the tree, saying mostly just: keep going down in the tree and traverse it. That would be really painful. Fortunately, we use something called the visitor design pattern. So there are what we call visitors, and you have several kinds of visitors that are generated automatically, so we don't write that by hand in the Morbig parser; these are objects that do nothing but traverse the whole tree, and it means that you can later just inherit from these objects and override only the part that is of interest, which makes it really easy to write something that goes looking only into, say, simple commands, without touching the rest. We'll have an example in a moment. So once we had this parser and these nice tools, we wrote a tool that just runs statistical analyses of shell scripts, because we wanted to know what was in these shell scripts, as that would guide our intuitions about what we would do in the project. So this is another tool that is also available on GitHub.
We call it shstats. We didn't have a fancy name for it, but it's nice. It works directly on the syntax trees that Morbig outputs. It has a preprocessing phase for the scripts where it tries to expand the parameters that you can know locally: if you have a constant that is defined at the beginning of your document, you are able to replace it everywhere in the document. I'll talk about that later. And because of the visitor design pattern it's easy to add an analyzer module. Here is an example. It's quite a lot of stuff, but let's break it down into pieces, quite fast. So, Analyzer: you just have to give it a name, and the command-line options that it requires; that's just for shstats to know who this analyzer is. Then you write a process_script function, which is just the way shstats provides the scripts to the analyzer. In this case it's a simple analyzer that just tries to see whether it can find the character '$' in words. What you do is you create an object, so it starts here: you create an object, you inherit from one of these visitors, and you just have to override one method, and this is what is here. You just override the method that visits the words. All the other methods will still be the same and will just crawl the CST, and this specific one will just test whether you have '$' in the word. Which means that in like 10 lines you've written something that goes down the whole tree and looks, at a certain place in the grammar, whether there are dollars. All right. And then what you do is you just ask this object to crawl the whole tree, and if it returns that yes, indeed, I found a dollar, you add the file name to a list of files that contain dollars, I think. And then you just have to be able to output a report about what the analyzer did. In this case you just write: here is the number of scripts containing a dollar, and then the list of scripts, if you want to see why they are there and what kind of dollar they are. Right.
In that case, just counting dollars could have been done with grep, and actually that's what I was doing like two years ago, a lot of grep things. Except that then you see that you have detected dollars that were in comments, so you have to remove comments; you can still do that with sed, maybe. And then inside quotes. And then there are here-documents that are not expanded, so you shouldn't count the dollars in there. And then, each time, you see that... (I can't hear you. Please go to the mic.) (Audience:) So, I think what you were saying is that if you do a grep you get only one occurrence per line, which is counted, but I think there are options to grep which allow you to count all the occurrences on the line. Well, the point is that grep is limited; well, limited... it does what it is supposed to do, but we need more than that here. In this case, first, having really this parser and this tree traversal allows us to have this expansion phase that helps us, for example, expand variables; we'll talk about that, I think, on the next slide. And then you can do much more complicated things: if you count the dollars, but you don't want to count a dollar that is a variable bound by a for loop, for instance, we can do that easily in our tool, and grep, well, it's simply not made for that. And I'll leave Ralf to talk about this expansion. So, I would like to talk a little bit about this expansion, which might seem trivial at first sight. What it does: let me remind you that we try to analyze the script statically, that is, without executing it. However, often when you write a shell script you do definitions, bindings of variables, which are in reality constants: you bind them once and then you use the same value for the same variable all the time, and we would like to exploit this in our analysis. So, stuff like this: you define a variable x, and then in the branch you define a variable y. You would find on line 4, in fact statically, that x must be 1 and y must be 2, and on line
7 you would find that x still must be 1 and now y must be 3, because now you executed the else branch, not the then branch. However, if you drop out of the conditional, then you still know that x has value 1, because you haven't changed it anywhere during the conditional; but you don't know any more what the value of y is, it could be 2 or 3. We are not doing that kind of set-based analysis for now; we just want to exploit the fact that we know that x has value 1, and about y we know nothing. Okay, so this looks quite easy. However, it is shell, and shell is weird, so let's start to explain why we think that shell is weird. Let's do a little quiz. Imagine you have a shell script, and in the shell script you have these three lines written at the top of the slide: x=1; then something like x=2 followed by a call to some command foo, whatever; and then you do echo $x. So what should, in your opinion, be the value which is printed by this? I give you five choices: 1, 2, 73, syntax error, or it depends. Okay, any other opinions? So, two people are saying it depends. Very good. It depends on what? (Audience:) At x=2, that variable assignment is scoped to the line, so at first, when you are just learning shell, and learning it well, you would say: well, because that variable goes out of scope after the command is executed, x must equal 1. However, you can't be so sure, because foo might be something that sets x and then re-exports it to the environment. That's true too. Okay, so it depends. Okay, thank you. It depends, in fact, on several things. The first thing it depends on is whether foo is a function or a special built-in of the shell, on the one hand, or whether foo is an external command which will be spawned as an external process. If foo is something like /bin/whatever, which spawns a process, then in fact the assignment of x to 2 is local to this line, so in this case you would see the value of 1. However, if foo is a special built-in of the shell, or is a function, then
the assignment of x to the value 2 will be global to the context of the shell, and in this case you will see the value of 2. However, if foo is a function, foo itself can also set the variable x, as was correctly pointed out, and then you could indeed obtain the value of 73. (Audience:) But the oldest shells didn't have functions, so it must be 1 if you're using the oldest shells. Well, we are talking about POSIX shells, and POSIX shells do have functions, so this is definitely possible in all the shells which are permitted by Policy; and special built-ins exist anyway. Okay, so I think we have to hurry up a little bit. Just to show you that there are more complicated cases, this one is kind of surprising, in case you don't know what it means. Here you have a prefix of variable assignments, then something which will spawn a process, and then you echo something. What happens here, well, I can execute it: what it does is it prints A first. This is kind of surprising, because in the prefix you said that x is B, and y is something, and also c gets an assignment by parameter expansion here; however, none of this is visible yet when the suffix of the command is expanded, which is kind of surprising. Then, we have seen that this spawns a process, so all the assignments which were done in the prefix should not be visible after execution of the process. However, what we see, in dash, and it will be the same in bash in POSIX mode, is A and C, which means that the assignment of y is not visible here, but the assignment of c, done by the parameter expansion, is. So it's kind of weird: the semantics is weird, the syntax is weird, and we are having a lot of fun discovering various features of the shell here. It's not trivial at all. So, now, what we found effectively when analyzing the corpus of maintainer scripts. First, we found some really, really trivial things that
would not even have required a parser at all, so this was really, really easy. The first thing is missing shebangs in maintainer scripts. In fact, Policy says they should be there in the maintainer scripts, and we found about 40 packages which did not have them, and then we did, according to our rules, what has to be done: we announced a mass bug filing, there was some discussion about which severity was appropriate for this, and we settled on important, and almost all of them have been fixed so far. So, thanks for that. There are only five of them remaining, and these five are precisely the five ASCII files I talked about when I did the breakdown by language. Then, set -e. set -e is necessary to make a shell script fail in case one of the commands in the shell script itself fails, and it is written that this should be done by any maintainer script, either by setting set -e, which is the normal way to do it, or by taking care of the way you invoke external commands and making the script fail with a nonzero exit status in case something unexpected happens. So we found again some 50 packages which do not follow this policy; in all of these cases we checked that the maintainer did not do something else to ensure that the script fails as it should, and we did some mass bug filing again, after discussion, and about a quarter of them have been fixed so far. Then we have the case of local. Maybe we should skip over this, because we are already a little bit short of time. local is also strange in shell; everything is strange in shell. The reason why local is strange in shell, just very, very quickly: local does not have an extent as in Java, where you have a bracketed block and the variable is local to that block. local in shell just means: from now on, that variable is local. Okay, and then you can write stuff like this, which makes a variable x local depending on something which happened before, and this means that, if you imagine you would like to write a compiler for shell, the parser
cannot know whether a variable is local or not, and that's strange. And we found indeed that there are some cases; we did this just yesterday, so we didn't look into it in detail, but there are 280 cases where there is a local inside a control structure, a for, or sometimes even a while, and we have to look at this in more detail to see whether it is a problem. Then we have more stuff; I think we should skip over this too. Then we did an analysis of the commands, and I give back to Nicolas. We did this in fact because we just wanted to know which commands are used the most in the scripts, in order to know how we should build up our model. I was just checking you weren't seeing the results. So, in your opinion, what would be the three most used commands? Maybe we won't do that, we don't have time, but you can just imagine. Do you really see that? Alright, it's visible. The first one is test, which occurs in like half of the scripts, but then more than four times per script on average; people seem to like tests, that's good. Then you have set, which is present in most scripts, which is something we almost already said. The third one is true; actually, that's counting occurrences, not files, because people love to do '|| true' just to forget about a mistake, so there are not so many files using it, but when they do, they use it a lot. And then you have the others. Here you have which, which is present in like half of the files. So if we were to order them by files, it would be test, then which, and we see dpkg-maintscript-helper, that's good to see, and deb-systemd-helper. Maybe we won't spend so much time on this slide. So we have an analyzer that tells us which commands are used, and in which way. For instance, if you look at the way set is used, you see that in a huge number of cases it's for the -e flag, which once again is not really surprising. If we look at potentially dangerous
commands like rm, you see that most of the time it's the force flag, -f, that is used, and sometimes recursive and force together, which is the frightening part. And if you look at, for instance, dpkg, you can see that four out of five uses of dpkg are just there to list the files of the package, probably in order to remove some of them. Maybe I'll just go on. And yes, when you look at that, the first thing you do is look at what is used the least, because that's sometimes where weird things happen, and this is how you discover 'mkdir -f'; you say, what is that? You check that it doesn't exist, and we found a few bugs like that; in that case it was pkg. Then we looked a little bit more in detail at test: we wrote a little parser for the test expressions themselves, and so we have here some statistics about the different comparison operators and unary file test operators. We won't talk about this; I would like to talk about something else which is more interesting. Well, maybe here: you see these are the binary operators which are used the most. Luckily for us, the last two are not POSIX; they are supported by GNU test. In fact, -nt is 'newer than', comparing the timestamps, and -ef, this one would be troublesome for our modeling of file systems, because it means comparing whether two paths point to the same inode, and if it were used a lot, it would mean we would have to model file systems as DAG-like graphs. The fact that it is not used means we can just model them as trees and ignore the few cases where something like -ef is used. I would like to talk about -a and -o. Why? Because they are mentioned in the Policy. The Policy says: test, if implemented as a built-in, must support -a and -o. -a and -o are 'and' and 'or', to combine complex tests. However, when you look at what POSIX says, POSIX says, well, it's obsolete. First of all it's an extension only, it's not in the
core of the POSIX standard, and furthermore it is marked as obsolete, and it's recommended not to use it. Why? In fact, the GNU documentation says exactly the same thing, and the reason why both recommend not to use it any longer is the same: it's ambiguous. And why is it ambiguous? Well, this comes from the fact that test expressions also contain a special case which allows you to write a test without any operator. In fact, if you write only one single word in the test expression, it is already a test: a test of whether that word is non-empty, as if there were a -n in front of it, except that you can just drop the -n, and that makes the whole thing completely ambiguous. For instance, a valid test expression is something like this: parenthesis, equals sign, parenthesis. There are two ways to read this. First of all, you might find it strange that someone would write this, but maybe they didn't write it directly; they wrote: parenthesis, $1 = $2, parenthesis, the parameters expanded to nothing, and you obtain what is written here. You can read it either as: the left parenthesis is the same word as the right parenthesis, which would be false; or as: the string '=' is non-empty, with this test written between parentheses, which would be true. So it's ambiguous in its structure, and in fact this can also lead to an ambiguity in the result. And then you can do funny stuff like this. This is in fact a legal test: -a -a -a -a. For me there's only one possible reading, but apparently there are other opinions, because dash says the result of this is zero, and bash, even in POSIX mode, says it's one. So it's weird again, and maybe this is a good reason why we should really not use this; and of course you can replace it just by the && and || operators of the shell, which is a much safer way to do it. Okay. Another interesting thing, and this again is a design flaw of the shell, is that we found almost
ten errors in the test expressions, in fact syntactic errors in the test expressions, in the maintainer scripts. And now one has to understand why this hasn't been detected before. One would expect that if someone makes a syntactic error in a test expression, then the first person who installs the package will find the error, complain, and send a bug report. This hasn't happened. Why? In fact, the shell conflates the boolean conditions true and false with presence and absence of errors. Okay, and this is the problem: if you have a syntactic error in the test expression, the test operator fails, and the shell sees this as false. You might get an error message printed to standard error, but this is easily ignored if you install a lot of stuff, and the result is just that the test does not behave as you would expect it to. Okay, so I will show you some of the things we have found. Stuff like this: this, of course, is false; pathfind would be a function here, and you would want to apply the function pathfind to the argument, so this should be something like this. This is something we found several times in scripts: here, what is missing is just an 'or' operator between the second and the third sub-expression. Why hasn't this been found before, one might ask? Well, the reason is that this test will succeed in case $1 is 'remove', and in all other cases, including disappear and purge, it will just fail. So it does something, but not what the maintainer intended. This is another one, a classical one; of course this is a problem with cutting, with decomposing the input into tokens: what is missing here is just the space before the closing bracket. Everybody has made this mistake, and then, again, the test just fails, it always says false. In this one, there is a missing continuation line; for the same reason, this also just fails miserably. And we found something like this: the less-than symbol is not even a POSIX test operator, the backslash is probably not necessary, and anyway the person probably
intended here to do a dpkg --compare-versions. So, stuff like this: we found nine of them, I think, and of course we filed bugs against all of them. And then we have redirections. Yes, we should maybe hurry. Okay, this is something we found recently by just asking: are redirections used? Yes, probably, but what's in there? And actually you see a lot of these, where you have 2 redirected to 1, and then 1 redirected to /dev/null. Do you know what it does? I should know what it does. What it does is: first it takes whatever 1 was writing to and says that 2 should write to the same thing; so if 1 was writing to standard output, 2 is now also writing to standard output. And then it says that 1 should now write to /dev/null. But that means, and this is probably not what is intended when writing something like that, that if this full command here logs something on standard error, it ends up on standard output instead of being flushed away. You probably mean the reverse: first you redirect 1 to /dev/null, and then 2 to the same thing as 1. And actually there are more than a hundred occurrences of that possible problem, so we should probably discuss at some point whether we should do a mass bug filing. And you also discover useless redirections, like 1 redirected to the same thing as 1, or... yeah, wait, yes: 1 redirected to the same thing as 2, and then 1 redirected to /dev/null. So we can detect a few of these; they are non-dangerous bugs, but still bugs. So, maybe, to conclude really fast, in order to have a few questions: we are in the CoLiS project, launched by Ralf here, and we aim at checking the correctness of Linux scripts in general, and of Debian packages in particular. This is a project funded by the Agence Nationale de la Recherche, which is just something funding research projects in France. We still have two years to have fun with shell scripts; you can check the main page. And in
the future we want to go further and not stay only on the syntactic level, but look at things with funny names like tree transducers and symbolic execution, to check more interesting properties of scripts. And I think we have a short time for questions. Thank you for your attention. (Audience:) Hi there. I haven't done any packaging in a while, but we used to have .config scripts, which were also shell scripts, and they interacted with debconf. Are they still around, and if so, shouldn't you expand the scope of your searching to gather those as well? Possibly. Well, you have seen that in our project we are already late, and it's quite complicated, so I think for the moment we prefer to focus on the maintainer scripts; they are all doing similar stuff. (Audience:) Config scripts are maintainer scripts, or at least they were 10 years ago... 10, 13 years ago, it's 2018 now. They belong to the package; they go into /var/lib/dpkg... oh my gosh, I'm an old-timer, I used to maintain XFree86 back when XFree86 was a thing. I'm sorry, what's your name? (Audience:) Oh, Branden Robinson. I've got a lot of painful experience with shell scripts. I love what you're doing; this is excellent work. In fact, these config scripts are just present statically in the archive too, and we can probably just copy their contents: when you source them it's basically just copying the contents and then handling them, so it's probably not much of a problem to work on these config scripts. They're just named packagename.config, and mostly they just do reads and writes. I don't understand .config. (Audience:) Yeah, it's a debconf thing; that was the last maintainer script that was added; it started happening about 2000. We still have a lot of these .config files there, and of course we should include them in the analysis, because they are setting variables which are used in the script, and the debconf... don't call it a registry. (Audience:) I saw your set statistics; apparently nobody's ever using the earliest, the original idea of set, setting the positional parameters; they're just setting the options.
Nobody's setting positional parameters 1, 2 and 3, so we didn't see this in the scripts. Well, we have the results: if this is used, it is in less than 0.4% of the uses of set, so it is really not much used, indeed. Alright, I think we are out of time anyway. Yes, he agrees. So thanks again.