So thank you for coming. It's a pleasure to be here with you. Today we are going to talk about how the bank does observability with multi-tenancy using Prometheus and Thanos.

First, let me introduce myself. My name is William Costa, and I work at Itaú as an SRE. That's me at Star Wars: Rise of the Resistance — I'm a huge Star Wars fan, and it's an amazing place for a fan. You can follow me by scanning this QR code and adding me on my social networks. Rodrigo Serra is a speaker too, but he couldn't be here for a good reason: Rodrigo is a father now, so it's a beautiful moment for him. Thank you, David.

So what is our mission inside the bank? Today, we help teams along their observability journey, and we assist all the squads that support the platform for metrics, logs, and traces inside the bank.

Let's see our agenda for this session. I'll give a brief introduction to Itaú and show how my team works inside the bank. Then we'll cover the challenge of metrics inside the bank and how we approached building this big platform. We'll look at the components and architecture of Thanos on Kubernetes, some big numbers about those components, and who monitors the monitors — how we monitor this big platform. We'll also see some of the Grafana panels that our users use most in the bank, and our next steps for this year.

Itaú is a huge bank in Brazil, with 98 years of history. We are a full-service bank: we deliver many kinds of financial services — for example insurance, investments, credit cards, checking accounts, and many others. We are present in 18 countries; the main ones are Brazil, Chile, Argentina, Paraguay, and Uruguay.

So who are we inside the bank? Itaú organizes its strategy through communities, and my community is the reliability community. The reliability community owns several strategies around reliability, and one of them is observability. We support both the systems and the environments of the bank through observability: metrics, logs, and traces all sit inside our community. We deliver journeys through inner source, and we encourage all users to practice self-service support. And we are open source first: we help all users and all teams build the best solutions using open source projects.

So let's talk about the challenge of metrics inside the bank. Today we have two main environments: the cloud environment and the on-premise environment in our data centers. Building a big metrics platform that serves both is very important. Today we have about 5,000 AWS accounts across development, homologation, and production environments, with a lot of workloads running inside each account, and we have 13,000 hosts running applications in our on-premise environments. It's very important to offer a single panel for visualization across both environments, and we also need to provide alerting on this platform for both of them. We run a 24/7 operation — the bank never stops — so we need a scalable and resilient platform. Inside the bank, the name of this platform is Cloud Metrics.

So how did we approach building it? How did we think when we designed this big platform? Availability, scalability, and resilience were very important for us, because this platform monitors both the systems and the environments of the bank.
So the keyword for us was isolation. For us, the most important thing was to isolate as many components as possible, including infrastructure and application. Each account has its own tenant, and each tenant needs to be isolated. This is very important, because it reduces the noise between tenants. We'll talk more about that in the next slides.

So let's see which components we chose to run Cloud Metrics. For running, today we have Kubernetes, with three main clusters — one cluster per environment. For collection, we have some journeys available that any user can deploy into their account. Thanos is the main core, the main application inside Cloud Metrics. For visualization, we have Grafana — all our users love Grafana for visualization. And for alerting, we have Alertmanager, integrated with some channels inside the bank.

OK, let's see how we built the Kubernetes infrastructure of this platform. Today we use AWS EKS, hosted in the São Paulo region in Brazil. The nodes are deployed across three AZs. The main concept here is that each node group runs one specific workload. For example, we have one node group for network monitoring, and — talking about Thanos — we have node groups dedicated to Thanos Receive, Thanos Query, and the many other workloads that run inside the clusters.

Let's talk about the application. Thanos is the main application here. We separated the Thanos components into three main layers. In the first layer we have the Thanos Query layer, which can fetch metrics from your specific tenant. In the second layer we have the Thanos Receive soft layer, which receives every metric that arrives in the cluster. And in the middle layer we have each tenant with its own components: inside each tenant we have one Thanos Store, one Thanos Receive hard, and one Thanos Ruler. Here we have two main diagrams showing how we implemented the Kubernetes infrastructure and the Thanos application.

So let's see how our users can see their metrics. I think this part is familiar to all of us: we have Grafana as the main tool for visualization. Grafana sends all queries to the Thanos Query layer, and the Thanos Query layer fetches all the metrics from your specific tenant. For storage, remember that we have one account per tenant, so each tenant has its own account. Inside each account we have one bucket — an S3 bucket — and the tenant accesses this bucket through a cross-account policy, so it can store and read all the metrics that live in that bucket.

To scrape all the metrics inside each account, we have Prometheus. Today we have several options for how our users can deploy Prometheus: we have Helm charts for Kubernetes, and CloudFormation and Terraform for ECS and the other platforms we have inside the bank. When Prometheus scrapes metrics inside your account, it sends them via remote write — remote write is a feature built into Prometheus. When the metrics arrive at the Thanos Receive soft layer, it identifies your tenant and directs your metrics to your specific tenant.

One important point here — sorry — is how the Thanos Receive soft layer works: it works like a proxy, as sketched below.
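To make that flow concrete, here is a minimal sketch of what the Prometheus side could look like. The receive endpoint URL, account label, and tenant ID below are hypothetical placeholders, not the bank's real values; Thanos Receive reads the tenant from the `THANOS-TENANT` HTTP header by default:

```yaml
# prometheus.yml — minimal sketch; endpoint and tenant ID are hypothetical.
global:
  scrape_interval: 30s
  external_labels:
    account: "123456789012"        # example label identifying the AWS account

remote_write:
  - url: https://thanos-receive.cloudmetrics.example.com/api/v1/receive
    headers:
      # Thanos Receive uses this header (its default tenant header)
      # to route the incoming samples to the right tenant.
      THANOS-TENANT: "account-123456789012"
```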
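On the Thanos side, a soft-tenancy Receive router like this is typically driven by a hashring configuration file (passed via `--receive.hashrings-file`) that maps tenant IDs to the hard-tenant receivers. A sketch with made-up service names — the bank's actual topology is not shown in the talk:

```json
[
  {
    "hashring": "account-123456789012",
    "tenants": ["account-123456789012"],
    "endpoints": [
      "thanos-receive-hard-123456789012-0.monitoring.svc.cluster.local:10901"
    ]
  }
]
```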
So the Thanos Receive soft layer matches your tenant ID and always forwards the metrics to your specific tenant's Receive hard.

Here we have a simple diagram of how we do alerting. Remember I told you that each tenant has its own components? Thanos Ruler holds all the alerting rules and recording rules that you have in your tenant. When a rule fires, Thanos Ruler triggers Alertmanager, and Alertmanager triggers your channel (a minimal example rule is sketched at the end of this talk). Inside the bank we have some channels available: ServiceNow, Microsoft Teams, and email.

So who monitors the monitors? How do we monitor this big platform? Today we use the same structure and concepts, in a separate environment, to monitor all the components of Cloud Metrics. Here we have an example of the numbers and graphs about the health of the Cloud Metrics environments.

Let's see some big numbers about the components. Today we have three clusters, and each cluster has 200 nodes — it's a big cluster. We receive 6,000 requests per second on the ingress, with 7,000 containers running inside each cluster. About Grafana, we have 5,000 dashboards and 4,000 active users. About Thanos, we have 3,000 Thanos tenants, and we generate 1.2 terabytes of storage per day and 1.5 billion samples per day.

Here we have an example of a dashboard inside Grafana. This dashboard shows traffic lights for the resources in AWS, and when you click on one specific traffic light, you can see graphs and numbers about that resource. It's very important for us, because the bank has a big strategy of moving all workloads to the cloud.

We help all users do business monitoring using metrics, so they can follow their business in near real time, and here we have an example of this practice. About SLI/SLO: we help all teams build SLIs and SLOs — it's very important for us — and here we have an example of this solution.

For our next steps along this year: we need to deliver a single pane of glass with Grafana — at the moment we are working on delivering this. We would also like to implement KEDA and Karpenter in all the clusters we have, and to deliver a single journey for alerting inside Grafana.

So I think that's it, guys. Thank you for coming, thank you to the KubeCon staff — any questions? I'll stay around here.
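As a closing illustration of the per-tenant alerting flow described above, here is a minimal sketch of a rule file that a Thanos Ruler could evaluate (loaded via `--rule-file`). The rule name, expression, threshold, and labels are hypothetical, not the bank's actual rules:

```yaml
# alert-rules.yaml — hypothetical example of a Thanos Ruler rule file.
groups:
  - name: example-availability
    rules:
      - alert: HighErrorRate
        # Fires when more than 5% of requests return 5xx for 10 minutes;
        # the firing alert is sent to Alertmanager, which routes it to a channel.
        expr: |
          sum(rate(http_requests_total{code=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m])) > 0.05
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Error rate above 5% for 10 minutes"
```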