Hey everyone, thanks for coming. Good morning. Today I'll be talking about Drupal Defense in Depth, which is a security framework for Drupal at scale. We've got a few items on the agenda today. We're gonna start out with some introductions, we'll step through the five key phases of the NIST Cybersecurity Framework, and then we'll summarize and look at next steps.

To start with, who am I? I'm Ming. I'm a DevOps engineer with Salsa Digital. I've been working with government and enterprise Drupal deployments since about 2020, mostly using amazee.io's Lagoon platform. Some notable projects I've worked on include the Victorian Government's Single Digital Presence. I've done a little bit of work for GovCMS, and I also work on Salsa's internal hosting platform, called Salsa Hosting.

So before we get into it, we need to go through a couple of concepts. The first of these is defense in depth. What is defense in depth? It's a layered approach to security, and traditionally it would be composed of physical, technical, and administrative controls. But as we know, with the move to cloud computing, the physical security layer is often taken care of by the cloud provider, and it's not something that people in Drupal DevOps generally have to be concerned about when managing the security of our Drupal sites. So we'll really just be covering the technical and administrative controls here.

Salsa's defense in depth strategy consists of seven layers. Going from the bottom up, we've got infrastructure, container hosting, application, edge protection, content delivery, people, and process. For this presentation, just to keep within time, we'll be focusing mainly on infrastructure, container hosting, application, people, and process. This is because organizations usually directly control these elements of the hosting stack. Our strategy is also built around a more containerized hosting stack using Kubernetes.

Secondly, what is the NIST Cybersecurity Framework?
It is a highly respected and widely adopted cybersecurity framework in the United States. Unlike a lot of other frameworks, it's not a certifiable technical standard, but it's highly flexible and it can be adapted to a wide variety of technical stacks and unique threat landscapes. It contains the five key phases we mentioned earlier: identify, protect, detect, respond, and recover.

So what's our strategy here? How are these two concepts related? For one, defense in depth is an effective strategy to contain many attacks. If attackers manage to compromise a single defense, the effective blast radius of their attack or exploit will be limited by the other layers of protection. And secondly, the NIST framework acts as a stepping stone to begin a journey to more stringent certifications. Many of the practices and recommendations from the NIST framework are echoed in these stricter standards. So aligning a defense in depth strategy with the NIST Cybersecurity Framework fortifies the layers of the defense strategy, while also helping to streamline the path to compliance with more advanced security standards like ISO or PCI.

So, starting off with the first phase of the cybersecurity framework, identify: how can we apply it to our defense in depth strategy layers? This involves identifying assets that may be at risk.

Infrastructure. Some infrastructure components that might be targeted include computing infrastructure, such as web servers and worker nodes, in addition to networking components such as load balancers.

Application. The vast majority of vulnerabilities in Drupal sites come from contrib modules, themes and libraries. It's important to keep track of those that are in use and monitor the Drupal security advisories for disclosed vulnerabilities.
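To make the idea of tracking what's in use a bit more concrete, here is a minimal Python sketch that compares an inventory of installed projects against a list of advisories. Everything in it is an assumption for illustration: the `Advisory` record, the project names, the versions, and the plain dotted version format are all hypothetical, and a real site would pull this data from its own tooling and from the published advisories rather than from hard-coded lists.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    """A hypothetical advisory record: project name and first fixed version."""
    project: str
    fixed_in: str


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a plain dotted version such as '2.0.3' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def find_vulnerable(installed: dict[str, str], advisories: list[Advisory]) -> list[str]:
    """Return installed projects whose version is older than an advisory's fix."""
    flagged = []
    for advisory in advisories:
        current = installed.get(advisory.project)
        if current is not None and parse_version(current) < parse_version(advisory.fixed_in):
            flagged.append(f"{advisory.project} {current} (fixed in {advisory.fixed_in})")
    return sorted(flagged)


# Hypothetical inventory and advisory data, for illustration only.
installed_projects = {"example_module": "1.4.0", "example_theme": "2.1.0"}
known_advisories = [
    Advisory("example_module", "1.4.2"),
    Advisory("example_theme", "2.0.5"),
    Advisory("not_installed", "3.0.0"),
]

for line in find_vulnerable(installed_projects, known_advisories):
    print(line)
```

Only projects that are both installed and older than the fixed version are reported, so the inventory has to be accurate for the check to mean anything.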
Container hosting. Parts of your container hosting or orchestration infrastructure that might be particularly exposed include things like the Kubernetes API server, as well as components of any observability stack in use, like Grafana or OpenSearch Dashboards. Other cluster management APIs in use should also be considered. For example, the Lagoon API for clusters managed by amazee.io is also exposed publicly.

People. With the level of sophistication of phishing and spear phishing attacks these days, as you know, personnel are often the weakest link in a cybersecurity strategy.

And finally, process. Although process isn't something tangible that an attacker can target directly, outdated processes are a risk that could negatively impact the implementation of the later stages of our framework.

Protect. How do we protect our assets? We need to have a comprehensive security strategy for each element of our technical stack.

So, starting off with infrastructure: since we're most likely in the cloud, we want to leverage cloud provider concepts. For example, we want to make effective use of things like network policies and security groups. We want to rely on things like cloud provider managed operating system images, because these are often configured with best practice in mind. We also want to look at things like regular rotation of our worker nodes to ensure that the operating system is up to date. This can be done manually or via a tool like Karpenter, where you can actually schedule your worker nodes to have a finite lifespan, after which they'll be removed and replaced with a more up-to-date version.

Application. For Drupal, we want to have strategies like configuration management and auditing to ensure that configuration is in line with your security policies. At Salsa, we have a tool called Shipshape, which is free and open source, that we use on client sites to ensure that any code being deployed has configuration that's in line with their policies.
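As a rough illustration of what auditing configuration against a policy can look like, here is a minimal Python sketch. It is not how Shipshape works; the `POLICY` structure, the `audit_config` helper, and the example values are all assumptions made up for this example, and the exported configuration is represented as plain dictionaries rather than being read from a real config export.

```python
from __future__ import annotations

# Hypothetical policy: config object -> key -> required value.
POLICY = {
    "user.settings": {"register": "admin_only"},
    "system.logging": {"error_level": "hide"},
}


def audit_config(
    exported: dict[str, dict[str, object]],
    policy: dict[str, dict[str, object]],
) -> list[str]:
    """Return one human-readable violation per key that differs from policy."""
    violations = []
    for config_name, required in policy.items():
        actual = exported.get(config_name, {})
        for key, expected in required.items():
            found = actual.get(key)
            if found != expected:
                violations.append(
                    f"{config_name}:{key} is {found!r}, expected {expected!r}"
                )
    return violations


# Hypothetical exported configuration, as it might look once parsed.
exported_config = {
    "user.settings": {"register": "visitors"},
    "system.logging": {"error_level": "hide"},
}

violations = audit_config(exported_config, POLICY)
for violation in violations:
    print(violation)
```

In a deployment pipeline, a non-empty list of violations would typically be treated as a reason to stop the deployment before the code goes out.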
Regular and automatic patch management for core and contrib. You want to have regular and automatic patch management for your sites to ensure that any security vulnerabilities are patched out. Although Drupal has very good security, we can extend that with security modules: Password Policy, Username Enumeration Prevention, Login Security, the TFA module. These provide a host of additional security features for Drupal. And of course, if possible, we want to switch to static content. So if your Drupal site is not too dynamic, you can host a static version of it. We can use tools like Tome and QuantCDN, which vastly reduce the attack surface of your application, since it's just a static representation.

Container hosting. We can put things like our API server on the private network so it's only accessible via VPN. Same for Grafana and logging dashboards, for example. You can use an IPS to detect connections to certain blocklisted addresses, protocols and domains. And you can protect your point of ingress. So for example, you can apply ModSecurity right on your ingress controller, so no matter how someone gets to your site, it's being protected by a WAF.

People. You want to have things like proactive security training and role-based access control according to someone's job function. Drupal has a very robust access control system, with roles and granular permissions. You want to do things like mandate the use of password managers and two-factor authentication across the board.

And of course, your process. You need to institute a security policy that enforces all the standards we've listed above, from your application configuration to network policies to user access.

Detect. How can we detect a breach, or someone attempting to abuse our cloud infrastructure?
So to start with, at the infrastructure level, if your cloud usage is fairly stable, cost and usage alerts, for example, are a useful indicator to determine if there's been a breach, especially if attackers use their access to try and spin up a cryptocurrency miner, for example, which would really spike your cloud bill. There are also tools such as Amazon GuardDuty, which can proactively monitor your instances and outgoing traffic to identify any breaches or potential breaches.

At the Drupal level, you can implement things like the Login Security and Security Kit modules. Drupal has pretty robust protection against brute forcing of accounts. But if someone is attempting to brute force multiple accounts on your site, for example, the Login Security module can actually detect that and alert you proactively. And with the Security Kit module, for browser security: if someone has managed to compromise an account on your site, they often try to embed their own content, their own images, scripts and iframes. With the Security Kit module, you can configure a content security policy so that if someone attempts to access a page on your site that has malicious content, the browser will refuse to load it. And you can even specify reporting endpoints. So if someone's browser has encountered content on your site that's against the security policy, it can actually report that to you, so you know there's some stuff on a page on your site that's against your policy.

Container hosting. At that level, we want to have things like centralized application logging and monitoring, so that logs from Drupal and PHP are aggregated and stored in one place. This allows your operations and support team to create alerts and monitor for events that might indicate a potential issue or a breach. Another advantage of this is that an attacker would be unable to tamper with any of your collected logs.
Even if they've completely breached the application, assuming they've gained code execution, they wouldn't be able to touch the logs, and you'd still be able to build a trail of where they've been.

People. Your content authors should be trained to spot unusual or suspicious content or activity on the instance and actually be encouraged to report it. Employees should of course be actively looking out for phishing threats sent to their emails and be proactive about reporting those as well.

And most importantly, your process. You want to have automatic alerting and proactive detection in place. You can achieve this through things like an IPS or Prometheus alerting rules. So if you know there's something you have to watch out for, you can create rules for it so that your team is notified ahead of time of any potentially suspicious activity.

Next phase, respond. There's a famous saying I like: it's not if you get hacked, it's when. So we need to have a strategy in place to respond to breaches.

At our infrastructure level, in the event of a DDoS attack, we could, for example, create a security group on our load balancer to drop incoming attack traffic. This is as opposed to blocking the traffic at the container hosting or web server level via something like an NGINX ingress rule. This is so that we can leverage the filtering that's performed at the cloud provider level to help alleviate pressure on our origin infrastructure.

For our Drupal application, most of the time, unfortunately, the only immediate response that we can take once our application is breached is to just take it offline. However, if you've actually taken a static snapshot of your site beforehand, it can serve as a good fallback. If your site is very dynamic, the functionality may be reduced, but your site will continue to be accessible in some capacity as opposed to being completely offline.
At this point as well, you'd want to look at initiating the restoration of backups while your static snapshot keeps your site online.

At the container hosting level, you've of course got things like version control and container images, which are immutable. This means that your deployment, no matter how badly damaged it is, can be instantly restored to a known good state. It's important to note that at this point your site is still vulnerable. It's not patched yet, but it can be very quickly reverted to the last known good state. We can also make use of our centrally collected logs and metrics to perform a root cause analysis.

People should have clearly defined roles and responsibilities, and all your documented, and hopefully practiced, disaster recovery plans should be executed at this point.

Recover. Recovering from a breach. At the infrastructure level, we want to look at things like recycling nodes, for example. Although containerization should have actually protected our host node, there's still the risk of container breakout, and recycling will return all your worker nodes to their original condition.

Of course, before we can put our Drupal application back online, we need to ensure it's patched and updated so it doesn't just get breached again. We can use information that we derived earlier from our root cause analysis, using our collected logs and metrics, to do this. And once that's done, once it's patched, we can bring it back online or cut over from our static version.

At the container hosting level, we'd restore backups using your chosen platform backup solution. One example of this is something called K8up. These backups are completely decoupled from your application, stored off-site, and encrypted to ensure integrity. So if someone's breached the hosting environment for your Drupal site, they can't touch your backups; they're completely separate.
People. Your staff should be clearly informed about the cause of any breach so that they understand why it happened and what they can do better next time.

And of course, at the process level, we should update our plans and processes, like our DR plans, to address any weaknesses that may have been identified during this whole process.

So those are the five key phases of the NIST framework. Let's step through an example scenario. How would this have held up in the case of CVE-2018-7600? Some of you might know it as the Drupalgeddon 2 vulnerability. How would this have protected us in the case of a mass-exploitable vulnerability like Drupalgeddon 2? Let's explore the process a potential attacker would have taken to attempt to compromise a Drupal site that was vulnerable to this exploit.

First off, they would have checked to see whether the site was vulnerable in the first place by attempting to perform a command injection. They would run the exploit and force the Drupal site to run something like the whoami command. A WAF product, such as ModSecurity or Quant WAF, deployed at the edge or at our ingress controller, would have easily picked up and blocked the request, as command injection is a very common attack type that these WAFs actually check for and block.

Secondly, let's say that day ModSecurity decided to go on leave and it wasn't working, and the attacker's command injection exploit went through unchecked. The next step might be to, say, set up a reverse shell to attempt to gain access to the site's environment. This is often done on an arbitrary port, something like 4444 or 7777. An IPS solution, such as Falco by Sysdig, deployed at your container hosting level, would have detected an anomalous outbound connection: why is Drupal trying to talk to something other than 80 and 443?
That's quite suspicious. It would be able to block that and potentially alert your team as well.

And finally, let's say all our technical protections were on snooze that day and none of them were working. Assuming the processes we discussed earlier were in place, operations staff would have been alerted to the fact that a new vulnerability was in the wild, and that sites should be patched and defenses tweaked to mitigate the vulnerability. If any hosted sites had been compromised at that point, centralized logging would have allowed you to investigate and pick up on that very easily. For example, you could create an alert rule that scans incoming traffic for the exploit pattern, and be alerted on it proactively.

So let's leave it there. To summarize, the NIST framework is a really great security tool. We've walked through the five phases in the framework and applied the defense in depth layers to them. This provides an effective defense strategy while being in compliance with a well-respected security framework. And what's next? It's time to start defending our Drupal sites. I hope this gave you something to think about. Does anyone have any questions?

Thank you for the presentation. Just a quick question: should we include reporting as part of the process? So in the case of, you know, a huge government agency site, or sites which have personal information and stuff like that, thousands of customers, for example, and assuming that, for some reason, it's vulnerable and got hacked: as part of that process, should we include reporting to, I mean, a government agency, or is there someone we should report to?

Yeah, if one of the hosted sites has been breached, you're often legally obligated to actually report that breach to your client or a government agency. It depends on your actual location, but you're often obligated to.
Do you have a list, or a top three, of Drupal modules that you always install for security, or security-oriented modules?

Yeah, so my top three are the Security Kit module, the Login Security module and Username Enumeration Prevention. I also have a module on Drupal.org called Security Pack, which pulls in a bunch of popular security modules and installs some pre-done configuration for you. So I think that's a good, like, one-click solution to help secure your Drupal site. It does things like enforce a stronger password policy, install Username Enumeration Prevention, and set up some brute force protection.

Thanks for the presentation. I have a quick question about the snapshot or backup. Say a site is compromised, and in a short period of time we just have no idea where it's going wrong. Normally we'd roll back to a previous snapshot or backup from, let's say, yesterday, but on local testing we realize it's still not working; the compromise happened way before yesterday. Do we have the capability to compare the snapshot backups to try to identify where this issue comes from?

Yeah. So in that case, if your site has actually been compromised for some time, that does make things more difficult. You'd have to identify the point at which your site was initially compromised. That's where those collected logs and metrics come in really handy. You could find the pattern of the account that got exploited, look at login requests, look at general HTTP requests, write up a query to actually find those and find the oldest match, and any snapshot before that should be good. But that also plays into things like how long you are retaining backups for. You could be in trouble if your site has been actively compromised for six months and you're only holding three months of backups. That would be quite a bad situation.
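The "find the oldest match, then take any snapshot before that" approach from this answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the weekly snapshot dates, and the suspicious log timestamps are all hypothetical, and in practice the timestamps would come from a query against your centralized logs.

```python
from __future__ import annotations

from datetime import datetime


def first_compromise(suspicious_times: list[datetime]) -> datetime | None:
    """Return the oldest suspicious log timestamp, or None if there are none."""
    return min(suspicious_times) if suspicious_times else None


def last_clean_snapshot(snapshots: list[datetime], compromised_at: datetime) -> datetime | None:
    """Return the newest snapshot taken strictly before the compromise."""
    clean = [taken for taken in snapshots if taken < compromised_at]
    return max(clean) if clean else None


# Hypothetical data: weekly snapshots, and timestamps of log entries that
# matched a query for the exploited account or request pattern.
snapshots = [datetime(2024, 3, 1), datetime(2024, 3, 8), datetime(2024, 3, 15)]
suspicious_times = [datetime(2024, 3, 12, 9, 30), datetime(2024, 3, 10, 14, 0)]

compromised_at = first_compromise(suspicious_times)
restore_point = last_clean_snapshot(snapshots, compromised_at)
print(f"First suspicious activity: {compromised_at}")
print(f"Restore from snapshot:     {restore_point}")
```

When `last_clean_snapshot` returns `None`, the compromise predates every retained snapshot, which is the retention problem described above.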