Hi guys, it's nice to meet you again at the Open Source Summit 2021. My name is Jeongwon Kang, and I work at the Affiliated Institute of ETRI in South Korea. As you know, we are in the COVID-19 pandemic, so I hope to see you in person next time. Today, I'm going to talk about a practical approach to control and authorize the execution of interpreters. The presentation consists of seven parts.

Introduction. In Linux systems, there are many features to prevent or limit unauthorized execution, such as file mode bits, access control lists, IMA/EVM, fs-verity, the noexec mount option, and mandatory access control frameworks. That's a lot, right? I think you have heard of them if you are interested in hardening the security of Linux systems. However, do you think they are enough to control unauthorized execution? In my opinion, they are not, because of interpreters and script files such as Python, Perl, Ruby, and so on. Of course, they are more powerful for writing programs than compiled languages, so we use them. But the problem is that most cyber adversaries also like to use them to hide their activities and artifacts. There is also fileless malware that uses interpreters on Linux systems. You can easily find such malware on the internet.

To solve the issue, there is some previous work: ClipOS with O_MAYEXEC as a flag of the open system call, the Astra Linux interpreter lock, and the O_MAYEXEC kernel patch from the Kernel Self Protection Project. The Chromium OS development team also gave consideration to interpreters and script files. Anyway, we will look into these works in more detail on the related work slides. So what's the problem? In the end, they are insufficient to restrict interpreters. In short, they have some faults and flaws. That's why I propose this approach.

Related work. Let's take a look at ClipOS first.
ClipOS is a Linux distribution focused on security and developed by ANSSI, a French government agency. ClipOS has a feature to control script files based on write XOR execute. As you see, it added a new flag, O_MAYEXEC, to the open system call, so the kernel decides whether to allow the execution according to the path and the noexec mount option. But the ClipOS approach requires patches to both the interpreters and the Linux kernel. As a result, it creates a maintenance issue to keep up with each new version.

Next, Astra Linux. It was developed for a similar security purpose as ClipOS. In Astra Linux, the Astra interpreter lock is one of several locks that serve as security features. It means that Astra Linux considered many things for security. If you are curious about it, you can download the installation image from the website. Anyway, as you see, it controls many interpreters such as Python, Perl, Dash, Ruby, and so on. In detail, the Astra interpreter lock finds each interpreter file and marks it in the extended attributes. The pictures show what I described, and Astra Linux modified the Linux kernel like ClipOS. On a side note, it wasn't working well when I tested it, and I don't know the reason.

Next, Chromium OS. From the Chromium OS documentation, we can understand why Chromium OS limits interpreters, even Dash and Bash. In Chromium OS, most applications are implemented in compiled languages, not as script files. So Dash and Bash are allowed restrictively, only for development.

Last, the O_MAYEXEC kernel patch from the Kernel Self Protection Project. As far as I know, it was proposed by Mickaël Salaün from ClipOS. He suggested O_MAYEXEC as a new flag for the openat2 system call. As we already know, it works on paths mounted with the noexec option. It also has the maintenance issue, because it depends on updated interpreters. Actually, I don't know when it will be merged into the mainline.
This is a brief overview of my approach. Actually, I referred to the related work. From now on, I will call the program that implements my approach the interpreter lock, for convenience. My interpreter lock works based on a write XOR execute policy, like ClipOS. It also handles the execution of an interpreter differently according to the UID range. If a process has a UID greater than or equal to 1000, then the interpreter lock will control the process. It means that the interpreter lock only controls normal users, not root or administrators. Although I will explain the details later, the target interpreters are Python and Perl, because most Linux distributions install them by default. Also, as you see, the interpreter lock is a kernel module, and it hooks functions such as execve, execveat, and bprm_change_interp within the kernel using the ftrace feature. So it allows normal users to execute script files only in non-writable paths. But as you know, an administrator like root can execute interpreters and script files in any path. Of course, normal users and their processes can't execute interpreters and script files in a writable path. In summary, my interpreter lock controls the unauthorized execution of interpreters and script files by path and UID.

Okay, let's look at this in more detail. You can divide paths by whether they are writable or executable. As I already described, the two are in an exclusive-or relation. You may be wondering how I divided the paths. First, I looked into the permissions of each path to see whether users can write or execute there. To do that, I also referred to the policy manuals of Linux distributions. As you know, a policy manual describes the policy requirements for a distribution. It also includes the structure, contents, and specifications for everything in it, such as UID classes, the list and purpose of paths, and so on. As I mentioned, the interpreter lock only controls normal users whose UID is greater than or equal to 1000.
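The path-and-UID policy described above can be sketched in a few lines of Python. This is only a user-space illustration under one reading of the rules stated in the talk, not the kernel module's code. The names `may_execute`, `NORMAL_UID_MIN`, and `script_dir_writable` are hypothetical and do not come from the project.

```python
from typing import Optional

# Threshold stated in the talk: UIDs at or above this value are "normal users".
NORMAL_UID_MIN = 1000


def may_execute(euid: int, script_dir_writable: Optional[bool]) -> bool:
    """Decide whether an interpreter invocation would be allowed.

    euid: effective UID of the calling process.
    script_dir_writable: True/False when a script file is being run and its
        path is writable/non-writable for that user; None when the bare
        interpreter is started without a script file (hypothetical encoding).
    """
    # Root and other system accounts are not controlled at all.
    if euid < NORMAL_UID_MIN:
        return True
    # Normal users may not start the interpreter on its own.
    if script_dir_writable is None:
        return False
    # Write XOR execute: a script may run only from a non-writable path.
    return not script_dir_writable


if __name__ == "__main__":
    print(may_execute(0, True))       # root, writable path
    print(may_execute(1000, True))    # normal user, writable path
    print(may_execute(1000, False))   # normal user, non-writable path
```

The decision is kept as a pure function of the UID and the path's writability, so it can be checked without touching the file system. How writability is actually determined in the kernel module is not shown here.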
So normal users can't execute the interpreters themselves, and they are limited in executing script files depending on the path. I already said that the interpreter lock targets Python and Perl. But there might be so many interpreters to control, right? Even I don't know all of them. Anyway, I classified the interpreters by whether they are installed by default or not, and so I chose Python and Perl. How about Bash and Dash? They are mainly command-line tools and not as programmable as other interpreters. As you know, we can't live without them on a Linux system. So the interpreter lock doesn't care about Bash and Dash.

To control interpreters and script files, I thought it was proper to control them at the moment of execution. So I took a look at hooking techniques in user mode and kernel mode. At first, I wanted to solve the issue in user mode with the LD_PRELOAD and ptrace techniques. But they have their limitations. I'm sure there is no better way than not using interpreters at all. As you know, that's practically impossible. Modifying interpreters is, too. Eventually, I solved the issue with the ftrace hook in kernel mode. It is very useful for hooking the execution path of functions within the kernel. But it's not a panacea, right? To hook functions in kernel mode, I have to consider stability and other things. Also, the Linux kernel evolves quickly, so I have to update my interpreter lock for each new version of the Linux kernel.

Implementation. On this slide, I will briefly describe a few things about the implementation. I implemented and tested the interpreter lock on Debian and Ubuntu, so you can test it on those distributions. I also implemented my interpreter lock based on the ftrace hook. Thanks to it, I was able to save time on trial and error. Anyway, the interpreter lock hooks three functions: sys_execve, sys_execveat, and bprm_change_interp. I will talk more about the three functions on the next slides.
There are two ways to run a script. First, we can pass it as an argument to an interpreter binary. The second is to run an executable script file directly. As you see, both are handled at the execve system call within the kernel, so the interpreter lock hooks that point to control them.

Next, I will describe the second case, which is executing the script file itself. As you know, such script files have a shebang on the first line. It indicates which interpreter executes the script file. As you see, the bprm_change_interp function sets the proper interpreter for the script file. So the interpreter lock hooks that function and controls the execution of script files.

Next, I'm going to talk about the execveat system call. As you see, it is similar to the execve system call, but the difference is the way it locates the file to execute. It executes a file based on the combination of a directory file descriptor and a path name. Of course, the directory file descriptor is obtained from the open system call. If it is not given, then the file is executed by the path name only. Also, the fexecve function calls the execveat system call internally. I think you can understand how to use execveat from the examples.

On this slide, I'd like to show you how to hook the execve and execveat system calls. The upper part is the interpreter lock and the lower part is the Linux kernel. I made three functions, and most of the work is done in a common function, as in the Linux kernel. The work is checking whether the file path, the execution arguments, and the environment variables are valid. The environment variables are inspected in the common hook function and in the bprm_change_interp hook. The inspection is no problem in the common hook function, because the address of the environment variables is in the arguments. But in the bprm_change_interp hook, it isn't easy to get the environment variables, so I refer to the auxiliary vector.
There is some extra space there, and that is OK at this point because the Linux kernel will initialize the auxiliary vector later. Also, I implemented a realpath function to identify the precise path of interpreters and script files. If you want to look into it in detail, I recommend you visit my GitHub. The URL is on the last slide. And for EUID separation, as in the source code on the slide, I made it control only EUIDs greater than or equal to 1000.

Discussion. On this slide, I will describe some issues that I considered during the implementation. During the implementation, I suffered from many bypasses. Finding and patching them was really time-consuming work. To handle it, I had help from skilled hackers who work at Theori in South Korea.

The first case is copying an interpreter binary. It's impossible to limit execution of the copied interpreter. Of course, if I tracked binaries back from the original through each duplication, I could solve it. But tracking files is beyond the scope of my topic. Anyway, we need help from other security features like IMA/EVM or the noexec mount option. To use IMA/EVM, we have to sign every binary file and manage the signatures. It is not a simple job, so I recommend using the noexec option. It isn't as complex as IMA/EVM, but we should separate the home directory from the root partition. Another case is hard-linked and symbolic-linked files. I could handle it simply with the realpath function. As you know, interpreters have many options for executing a script. There are also environment variables to consider. I developed the interpreter lock to handle all of them.

On this slide, I will talk about unusual cases that bypassed the interpreter lock. The first is the dynamic linker. I haven't seen binaries being executed through the dynamic linker directly in normal use, so the interpreter lock blocks that case without difficulty. But another case, using the userfaultfd system call, wasn't easy to solve.
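The hard-link and symbolic-link case mentioned above can be illustrated in user space. The sketch below is not the realpath function from the kernel module; it uses Python's standard `os.path.realpath` and `os.path.samefile` to show the idea of comparing resolved identities instead of names. The function name `resolves_to_interpreter` and its parameters are hypothetical.

```python
import os
from typing import Iterable


def resolves_to_interpreter(path: str, interpreters: Iterable[str]) -> bool:
    """Return True if `path` is one of `interpreters` under another name."""
    if not os.path.exists(path):
        return False
    real = os.path.realpath(path)  # follows symbolic links to the final target
    for interp in interpreters:
        if not os.path.exists(interp):
            continue
        # Symbolic links: both names resolve to the same canonical path.
        if real == os.path.realpath(interp):
            return True
        # Hard links: different paths, but the same device and inode.
        if os.path.samefile(path, interp):
            return True
    return False


if __name__ == "__main__":
    # The interpreter path here is only an example and may differ per system.
    print(resolves_to_interpreter("/usr/bin/python3", ["/usr/bin/python3"]))
```

A copied binary has its own path and its own inode, so a check like this does not recognize it, which matches the limitation described in the talk.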
The original purpose of userfaultfd is to enhance performance by handling page fault exceptions in user mode, not in kernel mode. But these days, many hackers use it to succeed in exploiting the Linux kernel. The userfaultfd technique was also able to bypass the checking rules in the interpreter lock. To solve it, we have to use a kernel of version 5.2 or later, which provides a sysctl option to control userfaultfd.

The last one is related to a technique used by fileless malware. As you see, you can find a memfd file for PulseAudio in the proc file system. And then you can check the technique by copying any binary into the memfd. Notably, even IMA/EVM can't block the memfd technique. In my opinion, memfd should be controlled for security. Anyway, I observed whether there are any cases that execute binaries through memfd. I haven't found any, so I added the memfd file name to the checking rules in the interpreter lock.

There are also exceptional cases such as the Python debugger, pydoc, a hyphen for Bash, and a symbolic link. In the last case, the interpreter lock mishandled the symbolic link as an execution option. Anyway, I fixed all the issues that we have found so far. Frankly, it was a kind of paranoid war.

Conclusion. Interpreters and script files are convenient for developing programs, but they must be restricted if we focus on security. Unauthorized users can do anything they want by using interpreters, and controlling that is difficult. The best way is to remove them entirely, if we can live with binaries only. So I proposed an approach that handles the issue better than the related works. As I mentioned, the approach has prerequisites, but I think they are a reasonable cost for security. That's the end of my presentation. If you have any questions, you can contact me on Slack.