Hello everyone, I'm Rudraksh Pareek. I'm a maintainer of KubeArmor and a software engineer at AccuKnox, and today I'll be talking about securing Jupyter Notebooks using KubeArmor. What are Jupyter Notebooks? They are tools for simple, interactive computing, popularly used for data science, scientific research, and machine learning workflows, among other use cases, and they have multiple deployment modes. The one we'll be talking about today is JupyterHub, which allows you to create large-scale, multi-user deployments of Jupyter Notebooks, though it's worth noting that KubeArmor is able to protect all of these environments. KubeArmor is a CNCF Sandbox project for runtime security enforcement. It monitors activities like process execution, file and network access, and capability usage by processes making syscalls, using eBPF, and it enforces user-defined policies restricting those activities using Linux Security Modules, or LSMs for short, such as BPF-LSM, AppArmor, and SELinux. It is able to protect workloads running as Kubernetes pods, Docker containers, bare-metal or VM processes, and so on. Now, let's take a look at the deployment model of JupyterHub and try to analyze its attack surface. We have a Kubernetes cluster running multiple instances of JupyterHub, each isolated in its own namespace and targeted at a different group of users. Within a namespace, we can have multiple pods. Each pod belongs to a specific Jupyter user, and these pods are managed by the JupyterHub deployment itself. The thing is, these pods are accessible to the user over the Jupyter Notebook web UI, and the user is able to do all sorts of things there, like execute code, run shell commands, or access remote servers, and all of these commands run inside the user's pod.
The attack vectors that these expose: remote code injection, of course, where someone runs malicious code through the web UI that might harm the cluster itself; and container escapes, where a user might be able to escape from their designated container into a different user's container in the same namespace or a different one, and so on. Now let's go through a quick demo, where we'll take a look at a JupyterHub environment and protect it using KubeArmor. I have a three-node GKE cluster here running two instances of JupyterHub, namely jupyterhub-group-1 and jupyterhub-group-2, in two different namespaces, and you can see there are existing JupyterHub pods here already. What I'm going to do now is create a new user through JupyterHub's front end and then set up some things for it. I can get the IP address that exposes the JupyterHub proxy, and now I'll create a simple user. Let's name this user user1, password user1. One thing you'll note here is that I haven't set up a domain for this environment, nor have I set up HTTPS, because this is just for demo purposes. That's not a recommended practice, so please make sure you have these enabled. I'll log in: simple password, simple username. Again, not a best practice. When a new user is created, JupyterHub creates a new pod for them, creates a persistent volume so the user's state can be persisted, and does a couple of other things. I can see that my user's pod has been created and is running. Now that I have access to the Jupyter Notebook environment, I can do a couple of things here: create new Python Jupyter notebooks, access the Python console itself, or execute shell commands using the terminal.
However, what I'm going to do is upload an existing Jupyter notebook that I've already created. You can see that in this notebook I'm downloading a malicious binary from a remote URL using a built-in package, installing it at a path that is not on the user's PATH variable, executing it, and then, once I'm done with my work, removing it so as to leave no clues behind. So you can see that it's this easy to execute malicious code in this environment. Now, let's see what we can do, and what some mitigations might be. How can KubeArmor come into the picture here? KubeArmor will monitor all the activities of the Jupyter Notebook users running in your JupyterHub environment, and based on those activities you can make observations and create policies that allow you to restrain the users. What restraints are we thinking about here? For example, we can allow users to execute binaries only from specific paths, which ensures they can't download a binary to their own path and run it, and that they can't write any new binaries to the paths reserved for system binaries. Then we'll allow network access only to Python programs. This also reduces the attack surface, because users won't be able to run shell commands that might spawn a reverse shell or something similar. And we'll prevent them from installing any global Python packages. These are some of the recommended best practices, plus some additional ones, that the Jupyter maintainers themselves suggest for running your environment; you can take a look at them in the official documentation. Now, how will this work, and will everything continue to work?
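The exploit pattern from the notebook can be sketched roughly as follows. This is a self-contained stand-in, not the demo's actual notebook: the demo downloaded a real binary from a remote URL, while here a harmless local shell script plays that role so the sketch can run anywhere.

```python
import os
import stat
import subprocess
import tempfile

# Sketch of the exploit pattern: drop an executable into a user-writable
# path, make it executable, run it, then delete it to leave no trace.
# A harmless local script stands in for the remote malicious binary.
payload = b"#!/bin/sh\necho pwned\n"
path = os.path.join(tempfile.gettempdir(), "exploit")

with open(path, "wb") as f:
    f.write(payload)                                    # "download" the binary
os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)    # make it executable
result = subprocess.run([path], capture_output=True, text=True)
os.remove(path)                                         # remove the evidence
print(result.stdout.strip())                            # -> pwned
```

Nothing in a default notebook container stops any of these steps, which is exactly the gap the policies below close.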
Here's how we're going to do this. When we install KubeArmor, it exports telemetry events for all your process executions and file and network accesses. You can consume these events through CLI- or GUI-based dashboards, and based on the observability you gain from them, you can create KubeArmor policies. What will we define in those policies? Basically, in a Jupyter environment the binaries mainly reside in the /usr/local/bin and /bin directories, and any regular user should be able to get all of their work done with access to only these. So they'll only be able to execute from these paths, no path outside them will allow process execution, and they won't be able to add any binaries to these paths. To see all of this, let's first install KubeArmor. When I go to the KubeArmor repository, there's a getting-started guide, and installing KubeArmor is simple: just a couple of Helm commands that you can copy directly. When we install KubeArmor, the first component that gets created is the KubeArmor operator, which is responsible for managing all of the KubeArmor containers. It deploys a snitch job on each of your nodes. I have three nodes here, so a snitch job was created for each. The snitch job goes over each node, analyzes things like the kernel primitives KubeArmor needs, configures KubeArmor for you accordingly, and installs it on each of your nodes. So it's easy to install, and the KubeArmor operator takes care of all the configuration; you don't have to worry about any of it. Another thing that gets installed is the KubeArmor relay: the telemetry logs exposed by the KubeArmor DaemonSet are aggregated by the relay and made available at a single point.
So any consumer can connect to the KubeArmor relay and get those logs. Now that we have KubeArmor running, let's try a few things out. We'll also use the KubeArmor helper binary here: KubeArmor has a client binary, karmor, which you can use for things like observing everything that's happening (your telemetry logs), getting all the alerts, and a few more things we'll look at. You can install it from the repository, but I have it installed already, so I'm just going to run it. The command to execute here is karmor profile; let's take a quick look at what it can do. In the profile command, I can specify a namespace, to observe all events at the namespace level, or I can specify a pod, which brings in all the executions and accesses happening at the pod level; similarly, I can do it at the container level as well. Since we only want to observe our jupyter-user1 pod here, I'll specify just that pod: karmor profile with the pod set to jupyter-user1. Now I'll execute the exploit again, and you can see I got logs for all the things that happened within that program. Let's start with the network events: you can see it accessed a remote IP and downloaded something. Then we can see the binary that was downloaded, and the execution of the /home/jovyan/exploit binary. I should also be able to see file visibility; however, by default KubeArmor disables cluster-level file visibility, because there are a lot of file events happening all over your cluster and it would be difficult to make sense of all of them. The recommended approach is to enable file visibility only for the workloads you want to monitor. Since I only want to monitor the file accesses happening in the jupyterhub-group-1 namespace, I'll simply add a Kubernetes annotation to it.
The annotation is kubearmor-visibility=process,network,file, which says: export process, network, and file telemetry logs from KubeArmor for this namespace. Once I'm done with this, I'll be able to see all the file accesses. Let's execute the exploit again. Now we can see when the binary gets written to its path and everything else that's happening: the program executed, it tried to write to the /home/jovyan path, and this action was passed, because nothing is stopping it from happening right now. So let's go a step further: now that we can see everything that's happening, let's make some KubeArmor policies to restrict it. What I have here is a KubeArmorPolicy, a custom resource of KubeArmor for protecting the pods or containers themselves. There's another resource, the KubeArmorHostPolicy, which is very similar; the only difference is that a KubeArmorHostPolicy protects your node, while a KubeArmorPolicy protects only your containers. This policy has a name and is applied in a namespace. I've chosen the jupyterhub-group-1 namespace, because that's where my user resides. Then I have a couple of selectors here. To see what these selectors are, let's quickly describe my user's pod, jupyter-user1. You'll see that these labels are taken from the pod itself. The app: jupyterhub label is common to all JupyterHub pods, so we take a further label, component: singleuser-server. For each user that signs up, the pod that gets created is a single-user server pod; that's the component name JupyterHub gives it, so we take this as well. This ensures enforcement on the containers or pods that have both the app: jupyterhub and component: singleuser-server labels.
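For reference, the visibility annotation from the demo, written out as a namespace manifest, would look roughly like this (the namespace name is the one used in the demo):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: jupyterhub-group-1            # namespace from the demo
  annotations:
    # export process, network, and file telemetry for workloads here
    kubearmor-visibility: process,network,file
```

The same effect can be had imperatively with kubectl annotate on the existing namespace, which is what the demo does.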
We could go further and use the hub.jupyter.org/username label, which would ensure this particular KubeArmor policy only gets applied for that one user. For now, though, I'm going to apply it to all of my users. So let's take a look at the policy itself. You can see the policy is divided into rules for the things we monitor: file accesses, process executions, and network. There's also an action, here at the global level; you can also specify an action at the per-rule level for more granularity. Let's see what this file rule is trying to say. It allows file accesses in the directories /usr/local/bin, /usr/bin, and /bin, with these directories set to read-only. That's one part. We also allow the path under /usr/local where Python installs its global packages, making it read-only as well, so users will only be able to read from it and not write to it. Now, you might be wondering: the environment that the Jupyter notebook runs in isn't a privileged user anyway. It's the jovyan user, which by default has no special privileges, so under the Linux discretionary access control (DAC) permissions on these files, the user won't be able to write to them anyway. How is this going to make any difference? The thing is, in certain use cases your user might require sudo or root privileges, or a user might exploit some vulnerability and thereby gain privileged access to the environment. We don't want to take any chances here. What these rules ensure is that while KubeArmor is running, even the root user won't be able to perform any of these actions. So this keeps us safe in every case.
Next, there's another rule we've specified so that other, relatively safe file accesses under the root directory (/) are allowed. That covers the file rules. Now let's move on to the process rules. The process rules match the directories /usr/local/bin, /usr/bin, and /bin, and, as the global Allow action says, the user is only allowed to execute from these paths and no paths outside them. So the scenario we discussed, where a user was able to download a binary into their home directory and execute it, won't be possible anymore. Similarly, the last rules, the network rules, say that only Python programs should be able to access the network, and no other programs. So let's apply this policy and see what KubeArmor does here. Since it's a Kubernetes resource, a simple kubectl apply gets it done. Now that my policy has been created, let's use a command to observe: karmor logs, which streams all the logs from my relay server and presents them to me. I'll use the JSON flag to see them all in JSON (you can also get them in plain text), and I'll beautify the output a bit. This will show alerts for all policy violations. Now, when I execute the exploit, there's a catch: the user is still able to execute the binary, and I get an alert here where the action is Audit. What does audit mean? Audit means you only want to get the alert and don't want to enforce on the user. This is KubeArmor's default posture. You can change it with a simple annotation: kubectl annotate, just like we did earlier, setting kubearmor-file-posture to block. This means that if any policy violation occurs, KubeArmor will not only audit it but also block it.
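Putting the file, process, and network rules together, the policy described above would look roughly like the sketch below. This is not the exact manifest from the demo: the policy name, the global Python library path (which depends on the image's Python version), and the interpreter path in the network rule are assumptions.

```yaml
apiVersion: security.kubearmor.com/v1
kind: KubeArmorPolicy
metadata:
  name: jupyter-singleuser-restrict      # hypothetical name
  namespace: jupyterhub-group-1
spec:
  selector:
    matchLabels:
      app: jupyterhub
      component: singleuser-server
  file:
    matchDirectories:
      # system binary paths are readable but not writable
      - dir: /usr/local/bin/
        recursive: true
        readOnly: true
      - dir: /usr/bin/
        recursive: true
        readOnly: true
      - dir: /bin/
        recursive: true
        readOnly: true
      # global Python packages are read-only; path varies by Python version
      - dir: /usr/local/lib/python3.11/
        recursive: true
        readOnly: true
      # other file accesses under / remain allowed
      - dir: /
        recursive: true
  process:
    matchDirectories:
      # execution is allowed only from these paths
      - dir: /usr/local/bin/
        recursive: true
      - dir: /usr/bin/
        recursive: true
      - dir: /bin/
        recursive: true
  network:
    matchProtocols:
      # only the Python interpreter may use the network
      - protocol: tcp
        fromSource:
          - path: /usr/local/bin/python3   # assumed interpreter path
      - protocol: udp
        fromSource:
          - path: /usr/local/bin/python3
  action: Allow
```

With an Allow action, everything matched is permitted and everything else falls under the namespace's default posture, which is why flipping the posture annotation to block is what turns the audit alerts into denials.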
So the user will get a permission denied. Now I'll apply it, and let's try executing again. You'll see that I'm no longer able to do so. Just like that, you have protected the environment from any malicious execution, and I also get an alert here saying that my user tried to execute the /home/jovyan/exploit binary, that the action taken was Block, and that the result was permission denied; this matched the posture that we set. I can gain further insight into which user it was from the owner information about the workload that KubeArmor sends: basically, it was a pod named jupyter-user1 running in the jupyterhub-group-1 namespace. This helps me decide that user1 is either compromised or not behaving well, so I can take action accordingly. KubeArmor has done its job of saving you in the moment, and this is where KubeArmor shines: it alerts you and blocks the action, with something called inline mitigation, which we'll take a look at later. So that's another thing done. Now we might also want to verify that our regular programs are still running fine. I'm going to do that by running a simple Jupyter notebook with some Python code. I have it already, so I'll upload it, and now let's execute it. You can see that I'm still able to install packages, and the packages are getting installed in my local directory, not the global one. "No module named matplotlib": Jupyter sometimes fails to detect new package installations, so let's execute this again. Now it ran. Basically, what we did here was import matplotlib, import NumPy, and write some sample code for plotting a simple graph from random values. So you see I'm able to do my regular operations just fine; no user behavior has changed.
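The verification notebook does roughly the following. This is a sketch, not the demo's exact notebook: the Agg backend is used so it runs headless, the figure is saved to a file instead of shown inline, and the pip step is shown commented out since the demo installed the packages with the user flag into the writable ~/.local directory.

```python
# Sketch of the verification notebook: regular data-science work still runs
# fine under the policy. Packages installed with `pip install --user` land
# in ~/.local, which stays writable; global site-packages stays read-only.
# import subprocess, sys
# subprocess.run([sys.executable, "-m", "pip", "install", "--user", "matplotlib"])

import matplotlib
matplotlib.use("Agg")            # headless backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

y = np.random.rand(50)           # sample random values, as in the demo
plt.plot(y)
plt.savefig("plot.png")          # the notebook showed the figure inline
print("plotted", len(y), "points")
```

Everything here executes from the allowed binary paths and writes only to the user's home directory, so none of the policy rules fire.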
Now, another thing to note: so far we have been getting the alerts and all this telemetry data in the CLI itself, not in a UI-based dashboard. You might want a richer interface for these. For that, the KubeArmor relay has integrations, and they're very pluggable. You can use external SIEM tools like Fluentd or Sentinel and export the visibility logs there, or you can use tools like OpenTelemetry: KubeArmor has an OpenTelemetry adapter that gets the logs from KubeArmor and exposes them in OpenTelemetry format. We have a sample dashboard consuming these logs and showing them in Grafana with Loki. You can also use a Kibana dashboard that's been created by the KubeArmor community; there's an entire repository for these dashboards. So all of this can be done through a UI as well, and it's built in an extensible way so that you can create your own dashboards. Now let's go further and take a last look at one more thing. The policy we created just now was based on the observations we made. However, there are frameworks like MITRE, CIS, and NIST, and they recommend best practices for hardening your environment. KubeArmor has a community-maintained repository where we create policy templates based on the best practices recommended by these frameworks. You can check this repository out, so you don't have to manually decide which policies you want for your environment. There's a handy command for that called karmor recommend. I can use it and specify a namespace, and what it will do is go through all the pods present in my namespace and recommend policies for protecting them. That ensures I'm keeping up with the best practices. The namespace here is going to be jupyterhub-group-1.
What this command is doing is scanning all the images present in my environment, in the jupyterhub-group-1 namespace, and accordingly pulling in policies from the policy-templates repository. You can see it has created all these policies, which you can apply according to your needs, and it explains what each policy is for. They're dumped onto your file system, and I can apply them as I wish. So that's it for the demo; let's proceed. Now that we have seen KubeArmor in action, let's take a look under the hood and find out how KubeArmor does it all. KubeArmor has two components at the kernel level, namely the system monitor and the runtime enforcer. The system monitor is responsible for hooking eBPF programs onto certain syscalls and kernel-level functions so that it can observe whenever they are triggered. The runtime enforcer is responsible for converting KubeArmor policies into LSM-native rules. Together, these provide alerting and enforcement. In user space, KubeArmor also integrates with the Kubernetes APIs and container runtime APIs so as to get all the metadata about your workloads. KubeArmor puts it all together and ships it in the form of alerts and telemetry, so that every user can make sense of the incoming kernel events through their association with a workload. It's important to note the approach that KubeArmor takes to achieve runtime enforcement; as mentioned in the demo, KubeArmor does it differently. The approach generally taken by enforcers is to kill the entire process or container in response to a threat detection. This is also called post-attack mitigation. It is not optimal, because it gives attackers enough time to cause real harm to the system, and it also leads to some sort of downtime for your applications. KubeArmor instead handles it through LSMs.
Through LSMs, we are able to achieve something called inline mitigation: because of the kernel code paths that LSMs sit on, it is possible to alert on a threat and prevent it before it executes. To summarize: Jupyter notebooks are essentially remote execution toolkits, and it is essential to secure the infrastructure they run on. The Jupyter community has suggested having some sort of protection at the kernel layer, and this is where KubeArmor shines. So check out KubeArmor, deploy it, and do join the Slack for any feedback or queries. Thank you.