Hello everybody. I'm Dmitry Levin, the maintainer of strace, and today I'll talk about a new strace feature called fault injection. So, does anybody know what strace is? Nice. As most of you know, strace is a traditional diagnostic, debugging, and instructional user-space utility for Linux. How traditional? It's 25 years old, so quite traditional. And traditionally it's used to monitor interactions between user-space processes and the Linux kernel: the best known are system calls, but also signal deliveries and changes of process state. Everything about strace is traditional: it has a traditional command-line interface and multiple filtering capabilities. Because the interface is traditional, it's easy to use for people who have been using it for twenty-something years, and it's quite powerful because of these filtering capabilities. But last year strace was extended to do something very untraditional: to tamper with tracees by injecting faults. It's called fault injection. The current implementation is based on work done by last year's GSoC student. I'm not sure how to pronounce his name, so I'm sorry if I get it wrong; he will correct me someday. So, what is fault injection? Just a brief recollection: it's a software testing technique used to improve test coverage, mostly of error handling paths that might otherwise rarely be followed, by introducing faults. This is a nice definition from Wikipedia; thanks to Wikipedia for it. So, where do we place strace among other fault injection tools? It's obviously software-based. It works at runtime, while most instrumentation tools work at compile time. It doesn't work by means of syscall interposition. It's user space, unlike many kernel-based techniques. It's unprivileged. And, as I said, it has a traditional command-line interface. So, now I'll show you a series of examples that will hopefully give you an idea of what this fault injection syntax is and what you can do with it.
And I'll start with a simple program, cat from coreutils, which is linked dynamically with glibc, and we will see what can be done. In the top box you see traditional output, and in the bottom box the same thing with fault injection. So, let's filter all open syscalls of this trivial command, and let's fail them all. What you can see here is that it ends in the dynamic linker: it tries to open its cache, and that fails; then it tries to open libc in the predefined locations for this architecture, four locations in this build of glibc, and that fails too. Nothing very surprising; it's quite predictable. Let's do something different. Let's change the error code that the dynamic linker gets from the default one, which is ENOSYS, to something that has a different meaning for the dynamic linker: ENOENT, the usual error code of the open syscall. As you see, it tries twice as many locations when it sees ENOENT. You can also inject the fault not on all invocations, but just on the first one, or whatever else. If you inject the fault only into the first invocation, you can see that the dynamic linker just does all the same: it tries libc in different locations, finds it, and everything works. But if you fail the second and all subsequent invocations, what happens? The dynamic linker opens its cache and tries to open libc according to the cache. As you see, it fails. Then it tries all the predefined locations, and you can notice that it tries to open libc at the same location it already tried before. Why does the dynamic linker do this? Because there are different code paths, and they don't know about each other. In the first code path it tries to open the file; in another code path it also tries to open it, and it doesn't care that it has already tried this. It's not a bug, just a funny thing. Now let's let the dynamic linker do its work and have a look at cat itself. So, the third open syscall is the one made by cat itself, and it behaves like you would expect from cat: it fails and exits with an error code.
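The demo above can be sketched as a shell one-liner. This is a minimal, hedged reconstruction, assuming strace >= 4.15 (where `-e fault=` was introduced); on modern systems cat typically issues openat rather than open, so both are listed:

```shell
#!/bin/sh
# Sketch of the slide's demo: fail every open/openat of `cat /dev/null`
# with the default injected error (ENOSYS). Assumes strace >= 4.15.
if command -v strace >/dev/null 2>&1; then
    # `|| true`: the tracee is expected to fail once libc can't be opened.
    out=$(strace -e trace=openat,open -e fault=openat,open \
          cat /dev/null 2>&1 || true)
else
    out="strace not installed; skipping demo"
fi
printf '%s\n' "$out"
```

Adding `:error=ENOENT` to the fault expression gives the second variant from the slides, where the dynamic linker sees ENOENT instead of the default ENOSYS; `:when=1` or `:when=2+` restricts it to the first, or to the second and subsequent, invocations.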
Another way you can specify faults is to say how often to inject them. For example, you can inject them starting with the third invocation and then every second one after that. You see that cat handles the situation quite properly: it reports an error in every case where there is one, processes everything that was opened, and exits with an error code. So it looks like cat works fine. Now let's have a look at something more interesting: let's combine fault injection with path filtering. In the top box you see the sequence of syscalls related to the files passed to cat in this case, and in the bottom box I will just fail each of these syscalls in turn, so we'll see how cat handles this. The first one we've already seen. The second one is fstat, and cat treats its failure as an error. It's quite unusual for fstat to fail, so probably the sane thing to do is to fail. On the other hand, fadvise is, as the name says, just advice: a hint to the kernel that cat is going to do some sequential reads, so cat is also quite right to ignore this error. More or less the same with read: if a read fails with a hard error, it's quite right to report it and fail. But what do you think cat will do if it sees a temporary error? Would it restart the read or would it fail? It would restart it. Cat is a good program after all, and it knows that an interrupted system call is a temporary error that should be restarted. And here is a peculiar thing: what happens when you fail the close syscall, which has no importance here? What does cat do? It just opens the file, processes it, and closes it, and it opens it read-only. What's the use of reporting a hard error when you can't close a file opened read-only? I don't know. I would say it's a minor bug in cat, very minor, to complain and exit with an error code, because the file was successfully processed from beginning to end.
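The "starting with the third, then every second one" schedule maps to a `when=` expression of the form `first+step`. A small sketch, with made-up file names, assuming strace >= 4.15 (note that on a modern system the dynamic linker's own openat calls are counted too, so which opens get hit may differ from the slides):

```shell
#!/bin/sh
# Fail openat with EACCES starting at the 3rd matching call, then every
# 2nd one after that (when=3+2), and watch how cat copes.
tmp=$(mktemp -d)
for f in a b c d; do echo "$f" > "$tmp/$f"; done
if command -v strace >/dev/null 2>&1; then
    out=$(strace -e trace=openat \
          -e fault=openat:error=EACCES:when=3+2 \
          cat "$tmp/a" "$tmp/b" "$tmp/c" "$tmp/d" 2>&1 || true)
else
    out="strace not installed; skipping demo"
fi
printf '%s\n' "$out"
rm -rf "$tmp"
```

As on the slide, a well-behaved cat reports an error for each failed open, still processes every file it did manage to open, and exits with a nonzero status.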
Here you can see, in brief, in the top box, the different fault injection expressions that would work. In the second box you see that strace can actually follow descriptors: even if cat has no idea what file it's working with, strace does know and can apply its filters. So what you see in the bottom box is a primitive access control implemented with strace. And now let's have a look at real bugs. This is a more or less famous bug in Python 3.5. It was found by the student who did this GSoC project. Python on every invocation needs some randomness for initialization, and when it fails to obtain this randomness, it's a fatal error, which is probably fine, but it's not fine that it throws a segmentation fault at hexadecimal address 0x50. That's not good; it's a bug. Why does it do this ridiculous thing? Because it calls a method of an object that was never allocated, because of the lack of randomness, and the hexadecimal address is just an offset in a virtual table. Fortunately, this bug seems to be fixed, at least in Python 3.6. First they worked around it by using the getrandom syscall, but I actually tried to fail getrandom and found out that it no longer segfaults. That's a funny thing anyway. Another real bug was found in the dynamic linker itself. If you fail mprotect, you can see that glibc's dynamic linker ignores the error from the first mprotect call but treats all subsequent errors as fatal. It's quite natural to treat all subsequent ones that way, but why does it ignore the first error? Because there are different code paths, and the code path that runs early in the current glibc dynamic linker just ignores the error. And this error actually can happen, for example because of fragmentation. The call that fails is the one that tries to remove access rights, setting the protection to PROT_NONE. So some pages remain accessible that are supposed not to be.
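Going back to the "primitive access control" idea from the bottom box for a moment, it can be sketched like this (a hedged example with made-up file names, assuming strace >= 4.15; `-P` restricts tracing, and therefore injection, to syscalls touching the given path):

```shell
#!/bin/sh
# Primitive access control: inject EPERM only into syscalls that touch
# one particular file, leaving everything else alone.
tmp=$(mktemp -d)
echo "top secret"  > "$tmp/secret"
echo "public data" > "$tmp/public"
if command -v strace >/dev/null 2>&1; then
    # -P follows the descriptor, so reads of "secret" fail as well,
    # even though cat itself never mentions the file name again.
    out=$(strace -P "$tmp/secret" -e fault=all:error=EPERM \
          cat "$tmp/public" "$tmp/secret" 2>&1 || true)
else
    out="strace not installed; skipping demo"
fi
printf '%s\n' "$out"
rm -rf "$tmp"
```

The public file should be printed normally, while every syscall on the secret file fails with EPERM.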
So, it would be a minor security issue if it weren't for the fact that, when the first mprotect call fails, it's very likely that all subsequent ones also fail, and those are properly handled. But still, it's a bug. Now I'll try to explain what's going on under the hood and how it's all implemented. When the tracee invokes a syscall, the kernel puts it into the so-called syscall-enter-stop state, where it's completely stopped. At the same time strace awakens: it fetches the syscall number and its arguments, applies the filters you've seen and all the other kinds of filters strace supports, decides whether to allow or skip this syscall or print something, and then tells the kernel to go on with the tracee. Then the kernel executes the syscall and, before passing control back to user space, puts the tracee again into a stop state, a slightly different one called syscall-exit-stop. At this point strace awakens again and may fetch the syscall return code and arguments, depending on whether the call is filtered or not. It also prints at this time, for which the return code is necessary, and then tells the kernel to let the tracee go on. This whole cycle repeats until something happens to the tracee, whether it exits or whatever. Taking these two parts together, you can see the sequence, and in this sequence there are two places where strace actually can tamper with the syscall: with the syscall number, the syscall arguments, and the syscall return code. And this is exactly how syscall fault injection is implemented. On entering a syscall, strace replaces the syscall number with an invalid one, minus one. The kernel sees this invalid syscall, and it's just an invalid syscall, so it returns an error for it, which is ENOSYS on most architectures, but not on all of them, so you shouldn't rely on this error code. Then, on exiting the syscall, if instructed to replace the error code, strace replaces it with the one that was specified. So it's actually pretty simple inside.
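One visible consequence of the "replace the syscall number with -1" trick is that the default injected error is whatever the kernel returns for an invalid syscall, ENOSYS on most architectures. A quick sketch to observe that, assuming strace >= 4.15 on such an architecture:

```shell
#!/bin/sh
# With no error= given, the injected fault shows up as ENOSYS, because
# the kernel really did execute an invalid syscall (number -1).
# Failing only the first openat (the linker's cache) still lets cat run,
# mirroring the earlier slide.
if command -v strace >/dev/null 2>&1; then
    out=$(strace -e trace=openat -e fault=openat:when=1 \
          cat /dev/null 2>&1 || true)
else
    out="strace not installed; skipping demo"
fi
printf '%s\n' "$out"
```

Specifying `:error=ENOENT` (or any other errno) makes strace overwrite that return code in syscall-exit-stop, as described above.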
But, as you can see from this slide, syscall fault injection is not the only thing that can be done; there are other kinds of injections and tamperings. For example, and this is some recent development, not in a released strace version yet, you can inject a signal at any of these points; the current implementation injects it on syscall exit. What would you use signal injection for? For example, if you want to terminate the program, or if you want to make it dump core for analysis by another tool like gdb. Another way you can tamper is to replace the return code of a syscall with something that doesn't look like an error at all. So you can pretend that a syscall completed successfully but actually skip it: by replacing the syscall number on entry with an invalid one, you skip the syscall, and on exit you replace the return code with, say, zero, which for syscalls like unlink means everything is fine. The tracee will assume that unlink completed successfully and the file is gone, but it's actually still there. This is not as easy as it looks for more complicated syscalls: they have semantics that have to be followed somehow. For example, when many syscalls succeed, some portions of memory are filled with useful information by the kernel, so for them it's not as easy. But for simple syscalls like unlink, it works in the current master. I probably forgot to mention that the API strace is using is called ptrace. It's the traditional interface, as everything about strace, and it's the same one used by gdb. That's more or less all I would like to tell about fault injection, but I can answer your questions, because some things are better explained in answers to questions. By the way, the title of this talk was probably invented by the student who did this GSoC project.
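The "pretend unlink succeeded" trick described above maps to the `inject` qualifier, which was added after `fault` (in strace 4.16; as noted, not everything on the slide is in a release). A hedged sketch, with made-up file names:

```shell
#!/bin/sh
# Skip unlink/unlinkat but report success (retval=0): rm believes the
# file is gone, yet it is still there. Assumes strace >= 4.16.
tmp=$(mktemp -d)
touch "$tmp/victim"
if command -v strace >/dev/null 2>&1; then
    out=$(strace -e inject=unlink,unlinkat:retval=0 \
          rm "$tmp/victim" 2>&1 || true)
else
    out="strace not installed; skipping demo"
fi
printf '%s\n' "$out"
ls "$tmp"   # "victim" should still be listed if the injection worked
rm -rf "$tmp"
```

Signal injection uses the same qualifier, along the lines of `-e inject=openat:signal=SIGTERM`, which delivers the signal on syscall exit as described above.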
You can visit his GSoC page, and maybe you'll get an idea of what was planned in the beginning and how it changed, and compare these things. You can grab the source: syscall fault injection is part of the most recent release, which was in December, I think, so you can use it. Do you have any questions?

[Audience] I'm kind of curious: are there testing tools you have seen that are built on top of the fault injection in strace?

So, this fault injection makes strace a testing tool, but it's quite a new thing, so I'm not sure it's already widely used as a testing tool. I'm thinking of using strace for strace's own test suite, because it's not easy to reproduce some faults related to the ptrace system call, for example. So it's just a new way of fault injection, a new way of testing something that would otherwise require compile-time instrumentation, and it's probably the only runtime, unprivileged, user-space fault injection tool for Linux we have now. So, I think it's time to ask questions.

[Audience] On the screen with the error: how could cat print anything if, in theory, it couldn't load libc into its memory? The print function shouldn't be available.

Which one of the slides? So here, cat is unable to open libc. The question was how cat was able to report an error if the dynamic linker was unable to load libc. It's not cat: the error diagnostics you see come from the dynamic linker. It's still the dynamic linker running at this point. You can see the prefix is "cat", but it's not cat; it's the dynamic linker complaining that it can't load libc, and that's all. Any more questions? Yes, please.

[Audience] Why not replace the syscall with something harmless, like getpid, instead of an invalid syscall number? If you have some monitoring system like SELinux or audit, an invalid syscall number might trip an alarm, while calling getpid should have no side effects.
So, the question is why the current implementation uses an invalid syscall number for fault injection instead of some harmless syscall that has no side effects, so that a monitoring system like SELinux or something like audit wouldn't complain that something wrong is going on. First of all, it's not easy to find a harmless syscall. I've been told here recently that even getpid may fail: in current Linux kernels getpid may fail, it may be not allowed. Okay, so in theory it's possible to invoke a getpid, I think. It's not a technical issue, but from an optimization point of view it's probably less work: you need fewer register manipulations this way. On exiting the syscall you would have to replace something like two registers on some architectures, while in the case of an invalid syscall you don't have to replace at least the register that contains the indication that it's an error. So I think it's slightly faster to use an invalid syscall, but it's quite a valid point. Thank you. Any more questions? No? Really? Okay. Yes, please.

[Audience] Is there already something, probably external to strace, to do random fuzzing, so that some syscalls fail randomly?

So the question, as I heard it, was: is there anything, probably external to strace, that does some random injection? In the initial GSoC design there was an interface for random injection, but later I decided that it's better done by a fuzzing driver that runs strace. My idea is that if you want randomness, you just do it in a driver that runs strace, not in strace itself. So currently this injection is deterministic; there is no random injection. But that might change if somebody suggests a plausible case for how it would be used, because currently I think it's better done in a fuzzing driver outside. Is it a problem that the command line is long? Okay. Does anybody have any more questions? Yes, please.
I'm not sure I can hear anything, and I would have to repeat what you're saying, so please make it audible at least for me. Okay. So Steve said that... well, those were just good words about this, and I'm not going to repeat them. Thank you.

[Host] Coming up next in this room, we have a quick talk on testing web applications.