Good afternoon, everyone. Thanks for coming to my talk. Today I'm going to talk about security in modern apps with zero trust and the next-generation web application firewall. I'm very happy to be here at this conference. I came with my entire family, so it's great that the conference lets you bring your family and your children and offers child care support. I appreciate that.

Before we start with the topic, let me introduce myself. I am Jose Carlos Chavez, a software engineer at Tetrate. Tetrate is a service mesh company; we build security features on top of the Istio service mesh. I'm an open source enthusiast; I've been doing open source for seven years now and I really enjoy it. I'm a co-leader of OWASP Coraza WAF, a WAF project at OWASP. I'm a Zipkin core member, for those of you in the world of distributed tracing and observability. I'm also a loving father, and I am from Peru.

So, web application firewalls: WAF for family and friends. What is a WAF? Traditionally, a WAF helps you protect web applications from incoming internet traffic. It protects you from known attacks like cross-site scripting, cross-site request forgery, and SQL injection, the kind of malicious traffic that aims to compromise your systems. It's usually deployed at layer seven as a reverse proxy in front of your servers, listening to and monitoring all the traffic and deciding whether each request is malicious or not. The proxy does the analysis and then decides whether the request goes upstream.

This is how you traditionally deploy a WAF: there is the outside world, with mobile applications, web browsers, devices, legitimate users, and attackers as well, and they all go through a firewall in front of your servers. As you can see, it becomes something of a bottleneck in the middle, so evaluation cannot be too expensive. Also, as you see, the WAF is decoupled from the server.
So any knowledge about the server is an overhead. But yeah, in the old days you had the WAF like this, like every other firewall.

Some of the classic features: IP fencing, where you can deny a specific IP through a deny list, or allow one through an allow list. You can do geo-fencing and geo-blocking based on GeoIP databases: basically, you create a virtual perimeter and say, requests from this region or country can go through, requests from that region or country are denied, depending on what the GeoIP database says.

Another quite useful but complicated feature is request and response inspection. The WAF buffers your headers, your body, and the query strings, runs analysis on them, compares them against known malicious patterns, and then decides whether the request is legitimate. It's useful because it allows you to prevent zero-day attacks. If you remember Log4Shell, for example, which was an attack through a query string parameter: a WAF can prevent that kind of attack when you know what the problem is, or how the attacker is approaching your service, but you don't have the fix in place yet. It also helps avoid client-side attacks, of course: bot attacks, malicious files, et cetera.

Then you have security rules: a curated list of rules that will block known attacks like SQL injection, cross-site scripting, local and remote file inclusion, remote code execution, and command injection, plus size restrictions, these kinds of rule sets. If you have heard of the OWASP Core Rule Set, for example, it's a curated list of exactly these rules for preventing these kinds of attacks, and it can be deployed in a WAF.
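To make that concrete: Core Rule Set rules are written in ModSecurity's SecLang rule language. Here is a simplified sketch of what such rules look like; the IDs and operators loosely mirror real CRS rules, but these are abbreviated illustrations, not the actual rule set:

```
# Block requests whose arguments look like SQL injection (libinjection-based check).
SecRule ARGS "@detectSQLi" \
    "id:942100,phase:2,deny,status:403,log,msg:'SQL Injection Attack Detected'"

# Block requests whose arguments look like cross-site scripting payloads.
SecRule ARGS "@detectXSS" \
    "id:941100,phase:2,deny,status:403,log,msg:'XSS Attack Detected'"
```

The real rule set carries many more transformations, exceptions, and metadata per rule; this only shows the shape of a rule: a variable to inspect, an operator, and actions to take on a match.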
Anomaly scoring is another interesting feature. You analyze the traffic, and when a rule matches (maybe a suspicious misspelling in the URL that suggests someone probing your server, or an actual SQL injection attempt), you assign a score, and then, based on a threshold, you decide whether to block the request.

Rate limiting is another well-known feature, where you block requests, or only allow a certain number of requests, within a window of time. I guess the most famous use case is limiting by IP, but you can do it on pretty much any other input: it could be the user agent, it could be the IP. Those are the typical use cases.

And then you have bot mitigation, which is basically analyzing the cookies sent by the browser and checking them to find out whether you're dealing with a legitimate bot or not. There are different kinds of attacks here: when you detect a bot you can launch a CAPTCHA challenge, and there are bot pretenders, bots that pretend to be good bots like the ones that index the web; there's web scraping protection, and so on. Those are the features of the classical WAF, let's say.

I know security has become a strong concern these last years, but before, it wasn't that hard. Security was easier, everything was easier, I guess, but security specifically was an afterthought. There was a time when security was just: I need more security, so I deploy a new WAF, and then I feel protected. Unfortunately, the way we now deploy and operate applications, and the number of dependencies we rely on when we build an application, are so big that this is no longer valid. The reason is that the traditional WAF was focused on perimeter security: basically, once you were inside the network, inside the security perimeter, you were considered safe.
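Before moving on: the anomaly-scoring decision I described a moment ago boils down to accumulate-then-compare. A toy sketch in Go, where the rules, patterns, and scores are all made up (real WAF rules are far richer):

```go
package main

import (
	"fmt"
	"strings"
)

// rule is a toy stand-in for a WAF security rule: a lowercase substring
// to look for and the anomaly score a match contributes.
type rule struct {
	pattern string
	score   int
}

// anomalyScore sums the scores of all rules matching the input, the way
// anomaly-scoring WAF modes accumulate per-rule scores instead of
// blocking on the first match.
func anomalyScore(input string, rules []rule) int {
	total := 0
	lower := strings.ToLower(input)
	for _, r := range rules {
		if strings.Contains(lower, r.pattern) {
			total += r.score
		}
	}
	return total
}

func main() {
	rules := []rule{
		{"union select", 5}, // SQL injection indicator
		{"<script", 5},      // XSS indicator
		{"../", 3},          // path traversal indicator
	}
	const threshold = 5 // block once the accumulated score reaches this

	query := "id=1 UNION SELECT password FROM users"
	score := anomalyScore(query, rules)
	fmt.Printf("score=%d blocked=%v\n", score, score >= threshold) // score=5 blocked=true
}
```

The OWASP Core Rule Set works this way in its anomaly-scoring mode: each matching rule contributes a severity-based score, and the request is blocked once the inbound anomaly threshold is exceeded.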
You would even have unencrypted traffic because, well, these things are deployed on your own servers, so why would you not trust them? The problem now is that, with all this cloud native madness we live in, and because of the scalability needs we have and the way we deploy things, a modern deployment is not something you can draw a perimeter around. There is no single, easily identifiable perimeter anymore, because you have cloud, multi-cloud, on-prem, third-party services, lambdas or functions as a service, artifact registries, et cetera. So where would you draw the line of what is trustable, based on a perimeter?

Also, in the time of microservices, a single call can traverse dozens of services, so the majority of the traffic is east-west, across fellow workloads, whereas the traditional WAF and perimeter security assume a single gateway receiving all the traffic from the internet to the internals. That's problematic: most of the traffic now happens inside your network, not from the outside.

And then, as I mentioned before, you have gateways in front of your services, and they have to have a lot of knowledge about the internals, because they have to protect the PHP applications, the Go applications, the Java applications, deploying tons of rules for different languages and knowing a lot about the implementation details of your components. That leads to operational complexity and misconfigurations, and you need an effective way to deploy all these configurations in a timely manner. It becomes much more complicated. If you read security reports from the last years, one of the biggest sources of vulnerabilities is misconfiguration, so we should be avoiding this.

And finally, the guiding principle of perimeter security is trust but verify.
But that means you first trust, and then verify when the attacker is already inside. And that is basically crazy now, because most of the time the attacker is already inside.

That's where zero trust arises. Coined by NIST, zero trust is about enabling the right user, under the right conditions, to get the right permissions to the right data. Zero trust is now very popular: everyone talks about zero trust, everybody sells zero trust, you can buy lots of zero trust. You have probably heard about a lot of zero trust tooling, frameworks, and services, so let's take a step back and think about the definition: it's a term for an evolving set of cybersecurity paradigms that move defenses from static, network-based perimeters to focus on users, assets, and resources. Basically, every actor in the internal communications of your deployments becomes a first-class citizen. This is the definition from the original paper, Zero Trust Architecture, from NIST.

And what are the driving assumptions? What is the shift in mindset that leads to this approach? Well, trust can no longer be based on a network perimeter, since perimeters can always be breached, and will be breached for sure. Policies have to be defined on the assumption that the attacker is already inside the network. That's the opposite of what we saw in perimeter security, where we first trust and then verify; here we are in more of a paranoid mode, where we trust no one by default. All access decisions have to rely on least privilege, per request and context-based, and on identities associated with users, services, and devices. You now have user accounts and service accounts, so all the actors participating in your request-response model, or whatever model, are first-class actors. They should be identifiable.
We should run security assessments on all of them, and we should be able to know which one is which and what permissions they have. It's no longer just "I'm logged in or logged out," as it was back in the day.

And security and access state constantly change over time. This is a shift in how you see your system in terms of security. Before, you had a security assessment and it was static. Now it's more like: when is my service not secure? When is my deployment not secure? Because from one minute to the next, a new CVE can be disclosed, and then your system is not secure anymore. But nothing changed: your system is the same as a minute before, yet the state has changed because of external conditions, because of the internals of your system. So if you were granted permissions yesterday, that doesn't mean you will be granted permissions today, even with the same payload, the same request, the same service, the same actors. Those are the driving assumptions.

Then we have the tenets. Tenets are principles, but not in a dogmatic way; more like the consensus of a group of people. First: all data sources and computing services are considered resources. As I said, before, your deployments were basically an arrangement of servers and endpoints. Now you have more dynamic elements: scaling groups, lambdas, functions as a service, things that will be deployed in your system at some point, in a non-deterministic way, doing something. You don't necessarily control them in terms of security, but they still have to get access to your components; they might have specific permissions on resources in your environment.

Second: communications are secured regardless of location. This is the main shift from perimeter security.
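Concretely, "secured regardless of location" means every request is checked against explicit policy, deny by default. A minimal sketch in Go, with made-up types; a real policy decision point evaluates far richer context than this:

```go
package main

import "fmt"

// request carries the per-request context a policy check considers;
// the fields here are illustrative, not exhaustive.
type request struct {
	ServiceID string // authenticated workload identity, e.g. a SPIFFE ID
	Tenant    string
	Resource  string
	Verb      string
}

// policy grants one verb on one resource to one service for one tenant.
type policy struct {
	ServiceID, Tenant, Resource, Verb string
}

// authorize is deny-by-default: access is granted only when an explicit
// policy matches every attribute of the request. Nothing is trusted for
// being "inside the network"; identity and context decide.
func authorize(req request, policies []policy) bool {
	for _, p := range policies {
		if p.ServiceID == req.ServiceID && p.Tenant == req.Tenant &&
			p.Resource == req.Resource && p.Verb == req.Verb {
			return true
		}
	}
	return false // default deny
}

func main() {
	policies := []policy{
		{"spiffe://cluster/ns/shop/sa/checkout", "tenant-a", "/orders", "POST"},
	}
	req := request{"spiffe://cluster/ns/shop/sa/checkout", "tenant-a", "/orders", "POST"}
	fmt.Println(authorize(req, policies)) // explicit grant exists: true

	req.Tenant = "tenant-b"
	fmt.Println(authorize(req, policies)) // no matching policy: false
}
```

Note the shape of the decision: there is no "allow unless denied" branch anywhere; anything not explicitly granted falls through to deny.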
You have access policies which default to deny, and then you have to grant specific access, asking questions like: why am I accepting this request from this other service? Even though we are in the same cluster, should I trust it or not? When you have a tenancy model, for example: is this service deployed for this tenant? Should I accept requests from it? Should I grant it permissions? These are new questions you should ask yourself, and then define least privilege based on that.

Then: access to individual resources is granted on a per-session basis. Permissions shouldn't be extended beyond a session. This is what we were talking about earlier: the fact that you have permissions now doesn't mean you'll have the same permissions, or be granted the same access, in five minutes, because security changes over time. Even if the request is the same and the input is the same, the context might have changed; policies might have changed, the deployment might have changed. All of that should be considered.

Then: access to resources is determined by dynamic policy, including behavioral and environmental attributes. That's what we were talking about: access policies now have to consider a lot of attributes that come into play and that you might not know beforehand. You can no longer define a single static model saying "these are the fields or attributes I should consider"; there will be attributes that become part of your network at some point that you didn't have beforehand, so the policies should be, let's say, elastic enough, flexible enough, to accept these new attributes.

Then: monitor and measure the integrity and security posture of all owned and associated assets. This is very important. We say you should learn from your errors; well, you should learn from your logs as well.
When you have audit logs about access, or denials of access, on certain endpoints, you should be paying attention to what happened: what triggered that denial, that access, or that error that made your service behave differently. Based on that information, you can provide new policies that cover these potential risks. So you should constantly be recording logs, auditing them, and monitoring changes; that way you can learn from them, create new policies, and improve the security assessment of your system.

Then: dynamic resource authentication and authorization are strictly enforced before access is allowed. This is a key principle in zero trust: every access should be mediated by a policy enforcement point. Actually, in the next slide we will see the diagram where you have a policy enforcement point in front of every service, analyzing whether you can access this or that. It connects to a policy decision point, which holds all the knowledge and information about which component has access to what, and what the policies are. Granting access and trust happens in a dynamic and ongoing fashion, so you should be able to evaluate every time. Access decisions can be cached, but, let's say, they shouldn't be, or only under certain conditions with certain policies. You cannot just say: I granted you access before, so I will grant you access now.

And finally: collect information on the current state of assets, network infrastructure, and communications to improve your security posture.
This is more or less what we were saying before: you should be monitoring everything in terms of security. What are the error rates? What are the anomaly scores you collected? What changed in your system that you should be aware of and analyzing, to determine whether something is suspicious or malicious, whether you should take action or create a new policy?

This is the diagram I was talking about, where zero trust talks specifically about a policy decision point and a policy enforcement point; they are the ones guaranteeing that your resources are protected. It shows an untrusted zone and an implicit trust zone, but the implicit trust zone is very local: it's basically the workload, because you're not trusting anyone inside. If you think about Kubernetes, for example: inside a pod, the sidecar, the actual container, and maybe another sidecar trust each other, because they are part of one logical unit of work, even though they are three separate applications. That's the implicit trust zone. Outside the pod, a pod from another namespace or wherever, that's the untrusted zone.

So then, if zero trust crushes perimeter security like this, why are we still talking about WAFs? We already said that it doesn't work and that we should throw away everything about security at the perimeter. Well, in life I like to say that less is more; that's my mantra. In security, less is less and more is more. The less you have in security, the less protected you are; the more you have, the more protected you are. That doesn't mean you won't get breached, but you have more chances of being protected. So the web application firewall is still a valid thing in the zero trust days, and pretty much so are other security measures like VPNs and other kinds of firewalls; perimeter security is still a thing.
Zero trust is not incompatible with, or the enemy of, perimeter security or network security. They are complementary; they are just different layers. If we look at the tenets I just mentioned, for example, there are two that specifically make room for the WAF.

You have integrity and security posture: every resource request should trigger a security posture evaluation. That's what a WAF does. Every time there is an incoming request, it analyzes the headers, the body, the URI, and then decides whether the request is legitimate or malicious.

When you identify an attack, you apply network patches and vulnerability remediation. That's also important, because once you detect that a vulnerability is being exploited, based on different signals (your services suddenly throwing lots of 500s, traffic appearing from one service that wasn't there before, one service suddenly reaching another, or a tool like Falco showing lots of calls from one service out to the internet), you put new policies in place to block it. First you will probably recreate the service, kill it, or even kill the machine, because it's already compromised, and then put policies in place to mitigate the problem. Sometimes you don't even fight the root cause but the symptoms: you start blocking access until you can figure out how this guy got into your system.

And then: collect information on the current state of communications. You should be doing continuous monitoring of the audit logs and traffic, and then improve the security posture; rate limiting is one option, or putting in new policies, new security rules, that will block the malicious traffic. So the web application firewall still fits here in the zero trust days, and indeed there is an opportunity.
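That kind of network patch, often called a virtual patch, can be as small as a regex guard on request input. A toy Go sketch using a Log4Shell-style payload as the example; the regex is deliberately simplified, and real rules account for many evasion tricks this one misses:

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// jndiPattern is a deliberately simplified stand-in for a virtual-patch
// rule against Log4Shell-style payloads.
var jndiPattern = regexp.MustCompile(`(?i)\$\{.*jndi`)

// virtualPatch reports whether any query parameter of the raw URL
// matches the pattern, i.e. whether the request should be blocked
// before it ever reaches the vulnerable library.
func virtualPatch(rawURL string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return true // fail closed on unparsable input
	}
	for _, values := range u.Query() {
		for _, v := range values {
			if jndiPattern.MatchString(v) {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(virtualPatch("http://app/search?q=${jndi:ldap://evil/a}")) // true: blocked
	fmt.Println(virtualPatch("http://app/search?q=hello"))                 // false: allowed
}
```

The point is not that this regex is complete (it isn't); it's that a rule like this can be deployed at the WAF in minutes, buying time while the real fix propagates through the dependency chain.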
So how would a zero trust web application firewall look? Well, first, it should protect workloads by filtering and monitoring traffic between and within workloads, at the policy enforcement point. Then, it should protect workloads from outside attacks like cross-site request forgery, cross-site scripting, as I said, file inclusion, SQL injection, the typical ones.

It should leverage network-wide patches for zero-day vulnerabilities. Sometimes you know there is a vulnerability but you don't know how to fix it, or even if you do, the path is long: the vulnerability is fixed in a library, then that library has to be updated by its consumers, and those by their consumers, and so on until the fix reaches your system. You can't just say it's going to be fixed or merged tomorrow. That's why with a WAF you can patch your network, basically. If you think about Log4Shell, for example, the payload came in a query parameter, so you could say: OK, I will block all requests whose query parameters match this regex. You don't solve the root cause of the problem, but you buy time to wait for the fix.

It should allow you to onboard legacy applications in a lift-and-shift fashion. Lift and shift is basically taking your old legacy application and putting it in the cloud, with no changes, because it's probably a legacy application you haven't touched for years. Maybe you don't even know how it works, or how to build it, or the language; you don't want to update anything or rebuild. So you just move it, put it there, and then you need protection, because that application may already be vulnerable.
Before, it was just sitting on an internal network; now you put it in the cloud, external, on servers somewhere you don't know, where people could potentially get access. So you put a WAF in front of it and you feel safer; you mitigate the risks. And that's very popular nowadays. Imagine a bank moving to the cloud: they are not rebuilding whatever COBOL they brought.

Next, flexible rule sets based on application internals. As we talked about before, with a gateway protecting all these servers, some might be PHP, some Go, some Java, and we have to conflate all of that and put security rules in place for everything. As we said, that's problematic, and it will probably add a lot of latency, because you have to evaluate a lot of rules. If you put different rule sets on different workloads, it becomes much easier to maintain and much easier to understand and digest.

And also, provide audit logs for further analysis and improve the security posture through adaptive rules. If you run anomaly analysis on your audit logs, you can find new rules arising, new attacks, and then you can create new directives that feed back into the application firewall and become a security measure, in an adaptive way.

And as we said, a lot of people are doing lift and shift; people in the service mesh world are asking for web application firewalls; and PCI compliance, the standard for electronic payments, also requires a WAF. So there are a lot of opportunities for WAFs in the cloud days, and this is why we believe WAFs also play a role in the zero trust days.

Now I will briefly talk about the open source project that I work on, which is a WAF project: OWASP Coraza. First of all, it's open source, so yay. It's inspired by ModSecurity.
And you might know that ModSecurity is going end of life in July 2024, so Coraza is a good candidate to replace it, because it's compatible and supports SecLang. It's also focused on the OWASP Core Rule Set I mentioned before, including the newest version, which is going to be released very soon.

It has multi-platform connectors. It supports native Go, of course, and Caddy; we have a connector for HAProxy; and it supports Envoy, Istio, Kong, and APISIX as well. We also just released a new version supporting proxy-wasm, which is a standard for WebAssembly in proxies, and it's fully compatible with WebAssembly. That means you can run this Go engine on different platforms. For example, the Coraza Playground, a website where you can test rules, runs entirely in the browser, because we compile to WebAssembly and run it there. That opens up a world of possibilities.

Pluggable architecture: we have a plugins API for extending functionality. A month ago, a Google Summer of Code student wrote a rate-limiting plugin for Coraza, which was cool.

And it's focused on high throughput and performance, because, as you might know, when the WAF sits in your gateway or your ingress, evaluation happens only once, so performance is of course a concern, but one you can accept. When you run on every single workload, performance really needs to be good, because you add latency on every hop. So it's performance driven, focused on memory consumption and CPU consumption, and designed to run on the critical path, because it will appear on the critical path many times: for example, at the policy enforcement point.

And finally, conclusions. First, zero trust isn't incompatible with other network perimeter-based security approaches. As I said, they don't hate each other.
They can work together and give you more security. No single component or function is sufficient to achieve a good level of security alone, but they work collectively, one layer over the other, to achieve protection. And the WAF still plays a role in the cloud days, that's for sure. If you have any questions, I will be around. Thank you for coming. Thanks. Any questions? I uploaded my slides to Speaker Deck and also to the conference platform. Questions?

[Audience question about trusting the sidecar.]

Yeah, I would say it's more of a trade-off. When you choose to run a sidecar along with your actual workload, you have already decided that it is trustable, although it might be compromised; who knows, maybe tomorrow they disclose a vulnerability in Envoy and we are done. You can never fully trust it. It's like testing: you never finish testing. It's the same: you cannot guarantee that this is 100% safe, but you make a decision that, OK, I will trust this. And then, yes, there is no security measure between your sidecar and your workload, that's for sure, because otherwise the latency would be huge. Also, because of the nature of the sidecar, it has a lot of access to your actual workload. So you should trust it; you have no other choice. That's why it's called the implicit trust zone: everything there is implicitly trustable. But it's a deliberate decision. I mean, nobody can guarantee this is 100% safe. Any other questions? Sorry, I couldn't hear.

[Audience question about embedding the WAF in the application.]

Yeah, I mean, you could use Coraza as a library and put it in a middleware in your code. But then another concept comes into play: you want to leverage the WAF as an organization-wide policy, and the easy way to do that is a sidecar. Otherwise you would be requesting, not forcing, but requesting, that all teams include this middleware, and Coraza, for example, is written in Go.
What if you don't have Go services, or you have multi-language services? I think that's one of the key points of having a service mesh: you want to apply organization-wide policies to everything without being intrusive, without requesting teams to do this or that, so teams can just focus on the service itself. You had a question?

[Audience question: why does the WAF need to be separate software? Why isn't it just part of Apache or NGINX, as yet another module?]

Yeah, ModSecurity was like that, for example: you would put ModSecurity in Apache as a plugin. That worked when Apache was the main orchestrator of your endpoints, let's say. Back then you had one gateway, but when endpoint A communicated internally with endpoint B, the communication was direct. With zero trust, you don't want that. You want both sides encrypted, with mutual TLS for example, and the traffic analyzed too. Imagine you have service A and service B, and someone gets into the container of service B and starts doing curls, or, if it's an interpreted language, starts modifying the code to make malicious calls. You don't want that trust. And at that point, the security you put in the main proxy is already gone; it doesn't come into play. That's why you put it on every single workload, let's say. That's the main thing.
Yeah, I would say that when you start a greenfield project and decide, OK, we are using this language and so on, things are easier. But as soon as you have legacy systems, or a bigger organization, or high throughput, you need to decide: for this application I use this language, for this one Kafka Streams, for this one something else. Everything becomes heterogeneous, and if you don't have a consistent way to deliver these policies, then it's like not having policies at all. That's the main problem. Any other questions? Thanks for coming, I really appreciate it. Thank you.