Thank you all for attending this late-afternoon session before the end of the day and the booth crawl. My name is Raymond De Jong, and I'm very happy to present here about adopting network policies in highly secure environments.

The agenda today: this is not a network policy 101. This is more a talk about how we see our customers adopting network policies in the field, and how you can use, for example, Cilium, eBPF, and Hubble, with the Hubble UI and Hubble CLI, to observe traffic and adopt network policies in your clusters effectively without blocking your applications. So I'll start with some strategies, tips and tricks, and approaches to adopt network policies, then talk about the Cilium features that matter for doing that, and look at observability as a superpower to make sure you're able to enforce security and observe flows to ensure your compliance. I've prepared a little demo, so I hope the demo gods are with me and I can show a small example of how you can use Hubble to secure your workloads.

My name is Raymond De Jong. I'm Field CTO for EMEA at Isovalent. That means I work with a lot of customers, both on the open-source and on our enterprise release, who are adopting Cilium and eBPF to secure their environments. Can I see hands: how many of you know about Cilium? Quite a lot. Okay, and how many of you know eBPF? Right, cool. And how many of you are already applying network policies in your environments? Okay, a few fewer hands. Okay, great. That's the whole purpose of today, to give you those tips. Isovalent is the company behind Cilium and eBPF: we created Cilium, we open-sourced Cilium, we're still contributing to the Cilium open-source release, and we also maintain eBPF. I will talk about Cilium and eBPF a bit later.

So let's start with a bit of a talk about strategies for designing network policies. What we see is that a lot of companies struggle with adopting network policies. You want to avoid allowing everything; a lot of companies want a zero-trust-based approach. But in the meantime, you do not want to block your application teams, and you want to make onboarding of your applications as easy as possible. You also want to prevent application teams from simply allowing all traffic in and out of their namespaces, so you need some constraints in your system to avoid that. Many of our customers also have compliance requirements: they need to ensure that all their flows are enforced accordingly. A default-deny approach is, in practice, really hard to start with from day one. So there is no easy road to adopting network policies, unfortunately. If you start with zero trust, you will most likely run into issues onboarding your applications, and that results in adoption problems or friction with the application teams that want to use your Kubernetes platform. A better approach is to focus on risk reduction. That means you need to define metrics for risk: you need to understand what's important for your environment, what kind of metrics you need to monitor and apply your policies on, and then move to fine-grained security policies, focus on the most sensitive namespaces, and leverage network observability tools to move to more fine-grained Cilium network policies.
Once the tooling and your policy flows are proven, you can repeat this across multiple namespaces and multiple platforms. Typically, application teams or multi-tenant platforms have one or multiple namespaces in a cluster, or across clusters, with deployments you need to secure. So it makes sense to focus on the namespace level.

When we talk about network policies and securing those workloads, it's good to understand the four types of risk exposure. First, very obviously: if you expose a service to the internet, that's a potential risk, so you want to secure and monitor those flows. Second, if you expose a service inside a cluster, that can also be a risk, because that service can be exploited from elsewhere on the network. Third, within a given cluster everything is allowed by default, so you have a wide blast radius you want to reduce. And fourth, you want to avoid data exfiltration over egress connectivity. The goal is to remove unused access, any access that creates exposure without actually being required, and that's also where the metrics are important. We want to reduce risk by reducing that wide blast radius. Again, your Kubernetes API may be exposed, your namespaces are open by default, and any pod can reach any pod. So whenever a workload is exploited, that could lead to a huge risk because of the wide blast radius. Our focus is to reduce that blast radius and, to start with, isolate at the namespace level. With strong observability, you can also observe any mismatches with your currently applied policies, which is what you want to minimize. If you see a lot of drops, that can either be a risk in terms of an attack, or it can be that something legitimately needs some kind of access. There are some examples: kube-dns is a good one. Workloads in every namespace need access to kube-dns. But a frontend service exposed through a load balancer may be the only one that needs to be exposed, on port 443 for example, while the actual service behind it, for example a Redis server, doesn't need that. Avoiding that exposure is a good best practice. You can also prioritize based on the number of services you expose through either Ingress or Gateway API, because those are most likely the services exposed on your network or even on the internet, and then focus on services which are reachable within your cluster or across namespaces, and on the number of services with access to external resources over egress.

Now, there's also a risk with this approach: overfitting. What we mean by that is that with a strong observability tool you could do a one-to-one translation of each flow into a network policy. The result would be a very brittle network policy design, because any small change in an application, or a new version of an application, immediately means that a small network policy somewhere needs to be updated or created. At scale, this is not feasible. So our approach is to start with an initial namespace policy and use global policies across your platform, or even across clusters using cluster-wide network policies, to define the guardrails. For example, you can use cluster-wide network policies to allow each workload access to kube-dns, so that service can always be reached, and likewise Prometheus, which needs access to monitor your services.
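As a rough illustration, a cluster-wide guardrail of that kind could look like the sketch below; the kube-system namespace and kube-dns labels are assumptions for a typical cluster, so adjust them to your environment.

```yaml
# Sketch: cluster-wide guardrail allowing every workload to reach kube-dns.
# The kube-system namespace and k8s-app=kube-dns labels are assumptions
# matching a typical cluster; adjust to your DNS deployment.
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: allow-kube-dns-egress
spec:
  endpointSelector: {}              # selects all endpoints in the cluster
  egress:
    - toEndpoints:
        - matchLabels:
            io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
```

A similar cluster-wide rule can cover the Prometheus scrape traffic mentioned above. Note that once any egress rule selects a workload, Cilium treats other egress from that workload as denied by default, so in practice a guardrail like this is combined with the per-namespace allow policies described next.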
In each namespace, you then only allow traffic within that given namespace, and use your observability tools to create rules for specific egress or ingress connectivity. So when you use Hubble and Cilium, you can start with a default namespace policy, observe traffic, which I will show in the demo, inspect what is being reached over egress and ingress, and secure that access. Then you transition from this per-namespace security, with global policies as guardrails, to more fine-grained policies. Obviously, focus on the most sensitive namespaces first, and focus on the metrics you've put in place, for example Grafana metrics, to see where you have the most exposure, which services get hit the most, and where the most traffic flows to those specific namespaces. For egress connectivity, maybe you want to narrow down to specific FQDNs or specific CIDRs and allow access on a specific port. For ingress, you want to expose connectivity through the load balancer and only allow, for example, the ingress to reach a given workload in your namespace. You then end up with global baselines, which are already set for your cluster, plus additional network policies per namespace for specific connectivity; in this example, specific FQDNs on specific ports are the only egress allowed from that given namespace.

Then use, for example, CI/CD pipelines, tools like Argo or Flux, your GitOps pipelines and such, to manage all those network policies. At scale, as a platform team, you cannot maintain all the network policies yourself, so you want to delegate the creation of network policies to your application teams or your tenants. However, you still want guardrails around access to, for example, the Kubernetes API, and you also want cluster-wide network policies that prevent any unwanted access. You can prevent, for example, any outbound connectivity on port 22, but obviously not on 443, because most likely some egress service needs that access. You can also use tools, automation, and checks to automatically reject PRs a given tenant may submit that would allow access to such a destination, and only allow specific FQDNs or specific IPs automatically. What we see in practice as well is an approval flow for teams to approve a given policy, plus a lot of automation that automatically checks for, for example, CIDR blocks that are definitely not approved to be reachable through a policy. Think also about resources in the cloud that you want to protect from access out of a given Kubernetes cluster.
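To make the port-22 guardrail above concrete, here is a minimal sketch using a Cilium deny policy; this assumes a Cilium version with deny-policy support (egressDeny), and it is only one way to express such a guardrail.

```yaml
# Sketch: cluster-wide guardrail denying SSH egress from every workload.
# Assumes a Cilium release with deny-policy support (egressDeny).
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: deny-egress-ssh
spec:
  endpointSelector: {}            # applies to every endpoint in the cluster
  egressDeny:
    - toEntities:
        - all                     # any destination, inside or outside the cluster
      toPorts:
        - ports:
            - port: "22"
              protocol: TCP
```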
So now I'm shifting to the part where I think Cilium is great for helping achieve those goals, using features that are specific to Cilium and eBPF. For those who don't know what eBPF is: we like to say that what JavaScript is to your browser, eBPF is to the kernel. It means we make the kernel programmable in a very dynamic and secure way, without changing the kernel, and this allows us to unlock a lot of features in terms of networking, security, and observability, which is what we talk about today. Think about when a pod or a process sends a packet or wants to open a port: these are kernel events we can inspect and secure based on the network policies we set. Cilium is built on eBPF, but you don't have to be an eBPF expert to operate Cilium; Cilium abstracts this technology for you. Based on the settings you enable or disable, Cilium will be configured to do the awesomeness eBPF brings, in the space of networking, observability, and security. What we're focusing on today is network policies. The encryption part is obviously a hugely important feature for enterprises as well. Also note, for example, egress gateway, which lets you send egress connectivity out of your cluster via specific IPs. On the ingress side we also offer service mesh functionality, and while that's not the topic for today, it allows you to do, for example, path-based routing, mTLS, TLS termination, and Gateway API support as well.

If we talk about security in Cilium, it's important to know that it's based on identity-based security. That means we're not tracking IPs; instead, we assign a unique identity to a given workload, based on the metadata you provide with that workload. Let's say we have a frontend service and a backend service. They have metadata, labels, associated with them, and that unique set of labels results in a unique identity. We use eBPF to attach that identity to the data plane, and we use that identity to secure and observe traffic from and to the given service. An identity is a cluster-wide property, so each node knows about it and can provide the metrics, visibility, and security you need. Additionally, Cilium provides layer 7 security. Beyond layer 3 and layer 4, we also support layer-7, API-aware security, meaning we can inspect, observe, and secure specific API calls, for example a specific method on a specific URL. And this goes beyond HTTP, because we also support gRPC, Kafka, Cassandra, and other protocols for that inspection and security. How that works is with the CiliumNetworkPolicy CRD, where you specify which endpoints the rule applies to and which labels it should allow traffic from, using fromEndpoints. Then, on a specific port, you can add additional rules, in this example HTTP rules, to only allow a GET method, in this case on the public URL.

For egress connectivity we also support DNS-aware Cilium network policies. This is super useful for cloud resources you want to secure access to, because their IPs change a lot. Look at S3 buckets, for example: if you have media content stored there and your workloads need to reach out to it, the underlying IPs will change frequently. So rather than allowing the whole CIDR block for those services, you specifically allow an FQDN that the workload is allowed egress connectivity to.

I also mentioned cluster-wide network policies, and this is such an example; it applies to every workload in your cluster. In this case, we want to allow access to kube-dns, and allow kube-dns itself to make DNS requests as well; for that we use the entity world, which covers destinations outside the cluster. We allow kube-dns on port 53, but we also allow, for example, Prometheus access to all namespaces in this cluster, on a given port in this case. This is also how you can template, for example, a per-namespace default allow: if you want to template the default allow for traffic within a namespace, you can leverage cluster-wide network policies for that too.
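To make those two policy types concrete, here are two rough sketches; the label values, path, and FQDN are illustrative assumptions rather than the exact policies from the slides.

```yaml
# Sketch: L7-aware ingress policy - only the frontend may call the backend,
# and only with GET on /public. Labels and path are illustrative.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: backend-allow-get-public
spec:
  endpointSelector:
    matchLabels:
      app: backend               # workload the rule applies to
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend        # identity allowed to connect
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: "/public"
```

```yaml
# Sketch: DNS-aware egress policy - allow a single FQDN instead of a CIDR block.
# The bucket hostname is an illustrative assumption.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-s3-egress
spec:
  endpointSelector:
    matchLabels:
      app: media-backend
  egress:
    - toFQDNs:
        - matchName: "media-assets.s3.amazonaws.com"
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
```

Keep in mind that FQDN rules work by inspecting DNS lookups, so they are typically paired with a DNS rule towards kube-dns, like the visibility policy shown later in the demo.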
If you have a multi-cluster use case, you can create network policies that span multiple clusters. Using Cilium Cluster Mesh, you connect multiple clusters independent of their clouds: whether you run on-prem on OpenShift, or on Azure or GKE, or both, you create a unified Cilium data plane. That means you can create Cilium clusterwide network policies, or Cilium network policies across clusters, which are cluster-aware. We understand the identity of each cluster, each cluster has read-only access to learn the identities from the other clusters, and we can again use those identities to secure and observe traffic across clusters. What that means in this example, and I hope you can read it, is that the frontend tries to reach a backend which is, in this case, deployed highly available across clusters; using eBPF and those identities, we're able to secure and observe that traffic across clusters. That's done through Cilium network policies, so you can specifically allow access from one cluster to another, and deny and drop traffic from yet another cluster. You can mix and match local cluster network policies with cross-cluster policies, and this is based on the cluster label, the cluster name, which is a unique identity in Cilium. This is a simple example of such a network policy: we have a rebel-base workload that we want to apply this policy to, and we allow ingress from the x-wing workload from cluster 1 on a specific port. So only that specific workload, from cluster 1, is allowed. (A sketch of roughly such a policy follows below.)

Now, adopting network policies is extremely difficult without the right observability, and that's why we introduced Hubble, which provides flow visibility for your workloads across clusters. It also enables you to do day-2 operations on Cilium network policies: applications change, and giving your teams access to Hubble allows them to learn what has changed and which flows they need to allow with network policies for the application to work. Hubble consists of three main components. The Hubble UI provides a namespace view of all the connections within a given namespace: namespace-to-namespace connectivity, workloads in a given namespace accessing egress resources, and, if you enable DNS-aware policies, the DNS requests as well, which I will show a bit later. We can also see ingress connectivity, so traffic coming from an outside resource through a service or an ingress resource into your namespace. The Hubble CLI is a more powerful tool for advanced filtering and advanced troubleshooting; you can also output JSON, which you can then import into another tool or use to export flows to SIEM platforms. And Hubble metrics is all about exporting these metrics, the flow data and layer-7 data, to Grafana, for example via Prometheus, so you can build the metrics I mentioned before into Grafana dashboards. This is a quick screenshot; I will not spend too long here because I have a demo of this as well. As you can see, it shows, in this case, a bookinfo view: I've deployed the well-known Istio bookinfo demo application, and I can see service-to-service communication. This application is accessible, in this example, from the internet, so I can see ingress connectivity through our Cilium ingress into the namespace, and I can see the service-to-service communication, the ports being used, whether traffic is dropped or allowed, and which protocol is being used.
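Here is the sketch of that cross-cluster policy referenced above; the workload names mirror the Star Wars example, and the cluster label key is the one Cilium derives from the cluster name, so treat the exact values as assumptions.

```yaml
# Sketch: cluster-aware ingress policy with Cluster Mesh - rebel-base only
# accepts traffic from x-wing pods running in cluster1.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: rebel-base-allow-x-wing-from-cluster1
spec:
  endpointSelector:
    matchLabels:
      name: rebel-base
  ingress:
    - fromEndpoints:
        - matchLabels:
            name: x-wing
            io.cilium.k8s.policy.cluster: cluster1   # cluster-aware match
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP
```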
A great tool for helping you create network policies is the network policy editor, reachable at editor.cilium.io. It gives you a view at the namespace level of both ingress and egress connections, and it's a visual tool to create network policies without having to write the YAML from scratch. You can write both Kubernetes network policies and Cilium network policies, and using the UI you can allow or deny specific flows, which automatically produces the YAML you can download and apply to your cluster. Recently, we also released Grafana dashboards: on the Grafana marketplace you can find dashboards with policy verdict metrics, which are a great way to monitor how many matches you have on flows in a given namespace or across a cluster. That lets you verify whether you have created enough network policies to match all the flows and that you're fully secured; and if you have drops, you can obviously monitor that too. I will show that in the demo as well.

Okay, that brings me to the demo. Let me set it up. I hope you can see it in the back; can you show hands if you can see the command line properly? Great, thank you. In this demo I've deployed Hubble UI and a simple application, to show how you can transition from coarse-grained to fine-grained policies, and I've also installed Grafana in my cluster. This runs on GKE. Grafana is currently showing metrics for all namespaces, and for this demo I want to focus on the my-app namespace. As you can see, I only have matches on the allow-all, which means that, for this given namespace, I have no matches on a specific network policy allowing layer 3, layer 4, or layer 7 connectivity. The goal for today is to make sure we have policy matches across all flows. We're also able to monitor ingress, and below we can see which services are running and which destinations and sources are being used.

So let me move to Hubble. This is a simple example: we have a client-server connection, the client is talking to the server, and you can see live flows appearing. We can see connectivity to the outside world, outside the namespace, and watching the flows we can see it's actually accessing public IP addresses. We can also see the client talking to the server on port 80. Currently the flows are all allowed within this namespace, and the goal is to secure the connections within the namespace as well as the egress to destinations in the outside world. The first thing I do is apply a Cilium network policy on this namespace that only allows access to kube-dns; the goal is to learn which FQDNs this workload is reaching out to on the internet. This is how it looks; it's an example you can find in our documentation as well. Let me quickly apply this file, so I'm doing a kubectl apply of the DNS YAML. Now this policy is created, and Hubble should already... yes, it's really quick: the live flows now actually show the FQDNs it's trying to reach. So using the Hubble UI and this simple DNS policy, I've learned that it's reaching out to cilium.io and isovalent.com, and in this case that's fine. I can use this information to create a fine-grained Cilium network policy in my next steps that allows those specific connections to those destinations.
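The DNS policy applied here is, roughly, the DNS visibility example from the Cilium documentation; here is a sketch of that shape, with the namespace and kube-dns labels as assumptions.

```yaml
# Sketch: DNS-only policy for the demo namespace - allow egress to kube-dns and
# proxy the DNS requests so Hubble can show which FQDNs are being looked up.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: dns-visibility
  namespace: my-app
spec:
  endpointSelector: {}             # all pods in the my-app namespace
  egress:
    - toEndpoints:
        - matchLabels:
            io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
          rules:
            dns:
              - matchPattern: "*"  # match (and surface) every DNS lookup
```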
The next thing I want to focus on is the client-to-server connectivity within my namespace. On the Grafana side, let me zoom in a bit, to the last five minutes: I can already see, after applying this network policy, that I have flows matching, because I'm intercepting the DNS queries from this given namespace. We can see the purple bands being created, showing those egress flows; those domains are also visible in my dashboards, and I have metrics on how many flows are being identified. So let me now focus on the server component; I've prepared another network policy. In this case I want to allow only the client endpoint to connect to the server on port 80. For the sake of time, and given this is port 80, let us assume we also want to allow HTTP, so this is a layer-7 Cilium network policy: I want to inspect this HTTP traffic and allow it. I save this and do a kubectl apply of the server policy. Now in Hubble, if I zoom in on the server side, I can see that I also get HTTP context. I've applied this policy allowing all HTTP methods and URLs, but now I can actually see what kind of requests the client is doing, and I can use this information to make it even more secure if I wish to do so. In this case it's doing GET methods on the base URL, and that's fine.

The next thing I want to do is secure the client. Before I do that, we can see on the egress side that we have more matches on the flows, while on the ingress side nothing has really changed. So, the client. I use, again, the information from the Hubble UI: it's reaching out to the server on port 80, so let's allow that, and we also learned in the previous step that it needs to access cilium.io and isovalent.com on port 443. Also note that I'm adding the matching labels of the server, and obviously port 80. I save this and apply it; a sketch of roughly what this client policy looks like follows below. We don't see a lot of change in Hubble, because that connection was already allowed; we do know it's allowed to reach cilium.io and isovalent.com, and we already have the server policy applied. But on the metrics side we should now also see ingress policy matches, and egress policy matches as well, which means I expect the egress policy verdict rates to show all green for matched traffic; the unmatched portion is already decreasing. So now I've basically secured outbound connectivity and ingress on the server component, only allowing access to the server on port 80 with HTTP. That means I'm now safe to delete the default allow-traffic policy for that given namespace. Let me delete it; obviously, this shouldn't change anything here, because we have created the right network policies, and on the policy verdicts we can now confirm, for both ingress and egress, using this dashboard, that all traffic is matched on both sides.
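Here is the sketch of that client policy referred to above; the label names and namespace are assumptions, while the FQDNs and ports are the ones learned from Hubble in the demo.

```yaml
# Sketch: client egress policy - in-namespace access to the server on port 80,
# plus the two FQDNs learned via Hubble on port 443. Labels are assumptions.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: client-egress
  namespace: my-app
spec:
  endpointSelector:
    matchLabels:
      app: client
  egress:
    - toEndpoints:
        - matchLabels:
            app: server           # the in-namespace server workload
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP
    - toFQDNs:
        - matchName: "cilium.io"
        - matchName: "isovalent.com"
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
```

The DNS policy applied earlier stays in place, since the toFQDNs rules rely on Cilium seeing the DNS lookups.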
So what happens if an application changes, or if we introduce a new namespace and an application in that new namespace needs to access this given workload? Let's have a look at what happens then. I need to go to another directory; I'm introducing another app. Not in the right directory, bear with me. So this is just another namespace, with another simple client reaching out to this given workload. Now let's see what happens.

In Hubble, once this workload is created, we should at some point see drops, because we're not allowing cross-namespace traffic. Here you can see this other app being created in the other-app namespace, and we can see little red lines: traffic is dropped. We can also zoom in on the server component, and yes, we already see some drops here. So we have confirmation that we have secured this workload: the other app is not allowed to reach it, the policy verdicts confirm that, and I also have metrics for these drops. Now we need to do something about this. Let us assume this access is allowed; that means we need to change the server Cilium network policy in this case. And let us assume we already know it's HTTP on port 80, so we're allowing that as well, allowing layer-7 access, and we select the specific namespace and obviously the app. So now I go to the policies directory and apply the server YAML I just updated. That's now configured, and it should result in the Hubble UI confirming it as well: yes, now we can see the other app is allowed to connect on port 80 with HTTP to the server component. The metrics confirm it too, because at some point we should see the drop verdicts disappear and the traffic being allowed.

I have one minute left, so a final slide to conclude this session. You can try this out yourself: we have excellent hands-on labs on our website, isovalent.com/labs, for both open-source Cilium and Cilium Enterprise, to try out network policies and the Hubble UI and to learn hands-on how you can adopt network policies. There are also great examples of such use cases on the cilium.io website; there are some nice Star Wars-based examples, across clusters as well, that you can try out. If you want to know more about eBPF, I recommend visiting ebpf.io, and visit our stand for more information; maybe we have some ebooks or books left about this topic as well. We have a signing session tomorrow with Natalia, who wrote a book about security observability using eBPF, covering both the runtime and network layers, access and processes. I'm open for any questions. Are there any questions? You can use the mic here. Good. Thanks.