Hello everyone, welcome to this webinar on enabling observability on the Kong API Gateway platform. In this webinar, I'll be walking you through a series of steps on how to enable observability using open source tools like Fluent Bit and the OpenTelemetry Collector. A little bit about me: I'm John Williams, and I'm a technology architect specializing in API management and API gateway functionality. I'm one of the recognized Kong Champions within the Kong community, and here you can find the link to my LinkedIn profile in case you want to get in touch with me.

Let me set some context on Kong API Gateway. Kong Gateway is a lightweight, fast and flexible cloud-native API gateway. Any API gateway provides a single point of entry for all the APIs behind your enterprise, and it also addresses cross-cutting concerns like authentication, monitoring, security and even caching. Kong Gateway runs in front of any RESTful API and can be extended through modules and plugins. This pluggable architecture makes it extensible, so if a particular functionality or requirement is not satisfied out of the box, you can create your own custom plugin and add that functionality to the gateway. Kong Gateway provides a variety of deployment modes, and in this webinar I'll be walking you through the hybrid deployment mode, where Kong is deployed with a control plane and a data plane; the data plane handles your runtime traffic. Kong has huge support from the community and is one of the leading open source gateways currently available in the market. One of the major reasons people are moving towards Kong API Gateway is the high throughput and low latency it provides at the gateway layer. To learn more about Kong API Gateway, visit konghq.com.

Observability and monitoring usually coexist in an ecosystem, and the major difference between the two is that monitoring, as many of you know, is the traditional approach of collecting and analyzing data on system performance, while observability goes a step further: it's about understanding the why behind the data, the root causes of issues, and how different parts of a system are interacting.

The first pillar of observability is metrics. These are quantitative measurements that tell us how our system is performing, so we are talking about things like response time, CPU usage, memory consumption and error rates. Metrics provide a real-time snapshot of system health, allowing us to identify bottlenecks and potential problems. These metrics can be any of the monitoring signals, so a metric can be a latency metric, a traffic metric or even an error metric. The second pillar is all about logs. These are the chronological records of events happening within your system. Logs capture everything from successful transactions to error messages and user actions. While metrics provide a high-level view, logs offer a granular perspective, helping us understand the sequence of events and even pinpoint the root cause of issues. The final pillar is traces. These map the flow of a request through your system, showing how different components interact. Traces provide a visual representation of how a specific user action travels across your servers and databases; they're like following a breadcrumb trail to understand the complete journey of a request.
As per Google's SRE principles, these are the four golden signals that one has to monitor to make sure the system is highly available. Latency is a measure of the time it takes to service a request from end to end, that is, from the request originating at your user application, through your gateway, to your microservice and even down to your database systems. Traffic measures how much demand is being placed on the system; this is usually your transactions per second. Errors tell you, out of the requests coming in, how many are failing with errors that are explicit or implicit. An explicit error is something we get back from the REST API, like a 404, 401 or 403, while an implicit error is a business error, for example when we aren't able to find the requested data in the database, which we usually convey in the body of the response. The last signal is saturation, which is a measure of how full your system is, covering your CPU, your memory and even your disk I/O performance.

The tools that will be used to enable observability on our Kong API Gateway platform are, first, Fluent Bit, which is a lightweight, open source telemetry agent for logs and metrics built for performance; it is designed for speed and collects events from diverse sources without bogging down your system. The second one is OpenTelemetry, which is a framework to create and manage telemetry data, such as traces, that can be exported to a broad variety of backends; it helps you streamline your monitoring activity by routing all telemetry to a single endpoint.

Let's see how we are going to instrument Kong API Gateway with the observability tools that we are using. In this topology I have installed Kong in hybrid mode, which has a Kong control plane and a Kong data plane. The Kong data plane is where all your runtime traffic flows, where the API requests are served and where all your metrics are captured. In this picture you see the Kong data plane represented here, running the Kong proxy pods. I have also enabled the Prometheus plugin, which is a Kong plugin that captures metrics for the Kong data plane, and the OpenTelemetry plugin, which captures the telemetry data required for tracing a particular transaction going through the Kong data plane at runtime. On the default namespace of the Kubernetes cluster I am running Fluent Bit and also the OpenTelemetry Collector. Fluent Bit will take care of aggregating the logs and the Prometheus metrics from the Kong data plane, and the OpenTelemetry Collector will take care of capturing all the traces and then forwarding them to our centralized observability system. Here I have used Fluent Bit with the tail input plugin for the logs and the Prometheus scrape input plugin for the metrics.

Let me show you the setup that I have in my environment. I am running a minikube cluster, and I am running Kong in the hybrid model, which is Kong with a control plane and a data plane. The data plane will be serving all my traffic, and on the default namespace you can see I am running my Fluent Bit agent and also the OpenTelemetry agent. The Fluent Bit agent will be responsible for capturing the logs and metrics, and the OpenTelemetry agent is only for the traces. Let me show you the configuration that is set for Fluent Bit.
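Before looking at the actual screen, here is a minimal sketch of what such a Fluent Bit configuration could look like in its YAML format. The container log paths, the Kong service name, the New Relic endpoints and the placeholder API key are illustrative assumptions, not the exact values from my cluster:

```yaml
# Minimal Fluent Bit pipeline sketch: tail container logs and scrape
# Kong's Prometheus metrics, then forward both to New Relic.
service:
  flush: 1
  log_level: info

pipeline:
  inputs:
    - name: tail                          # collect container logs
      tag: kube.*
      path: /var/log/containers/*.log
      exclude_path: /var/log/containers/*_kube-system_*.log  # drop system containers
    - name: prometheus_scrape             # scrape Kong data plane metrics
      tag: kong.metrics
      host: kong-dp.kong.svc.cluster.local   # assumed data plane service name
      port: 8104                             # Kong status listener used in this demo
      metrics_path: /metrics
      scrape_interval: 10s

  outputs:
    - name: http                          # ship logs to the New Relic Logs API
      match: kube.*
      host: log-api.newrelic.com
      port: 443
      uri: /log/v1
      format: json
      header: Api-Key <NEW_RELIC_LICENSE_KEY>
      tls: on
    - name: prometheus_remote_write       # ship metrics to the New Relic metric endpoint
      match: kong.metrics
      host: metric-api.newrelic.com
      port: 443
      uri: /prometheus/v1/write?prometheus_server=kong-dp
      header: Api-Key <NEW_RELIC_LICENSE_KEY>
      tls: on
```

The key idea is simply that the tail input feeds the log output and the Prometheus scrape input feeds the metric output, matched by tag.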
So here you can see Fluent Bit is configured with its inputs, parsers and outputs. For the logs I am using tail as the input plugin, which picks up all the logs coming from the containers; by default it captures everything, and I have given an exclude path that filters out the system-specific container logs. The second part is the Prometheus scrape input plugin, which captures data from the Kong data plane running on port 8104; it queries the /metrics endpoint and captures all the metrics exposed by the Kong data plane. Finally, on the output side you can see I am shipping all these logs to this particular endpoint, which is the logs API for New Relic, along with the API key required for authentication. Similarly, for the Prometheus scrape data I am sending everything to the metric API endpoint, which will store all the metrics. We also have a config file for the OpenTelemetry agent, where we configure the receiver, which is running on port 4318; once it receives the traces, it exports them to the OpenTelemetry endpoint for New Relic, again along with the API key. So that is the setup for our telemetry agents.

Let's look at the Kong configuration that we have done for this particular demo. I created a couple of services here; the one we will use talks to an httpbin backend on /anything, and this is what I use for sending traffic. For this service to be accessed via Kong, we created a route, and this route is on the path /anything. We have then enabled three plugins. The first is the request transformer, which I added just to attach an additional header that we want to send to the backend system; you can see the test header I have added here. The second plugin is Prometheus, which is used to expose the metrics to the Prometheus scraper that Fluent Bit is listening with; I have enabled only the few metrics that are required for me, the latency, bandwidth and status code metrics, and you can enable additional metrics based on your requirements. The third plugin is the OpenTelemetry plugin, which I have enabled so that whatever traces are created by the Kong data plane at runtime are sent to this particular service endpoint, the OTel Collector endpoint running on our minikube cluster in the default namespace. Whenever this plugin generates any traces it will forward them to that endpoint, and once the OTel Collector receives those traces it will forward them to New Relic.

Now let's see how the data that we captured using Fluent Bit and the OpenTelemetry Collector shows up in the centralized system; for this demo I'm using New Relic. This is the metric data that we captured with Fluent Bit and forwarded here. As you can see, we are capturing all the metrics coming from Kong, and here you can see the total number of requests that were hit within the time frame in which I ran the traffic. The other metric I am interested in is the request latency, which is the total latency from the origination of the API call until the response comes back, and then there is the upstream latency, which is the latency of the backend RESTful API for which we are processing the traffic.
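Before we move on to the logs, a sketch of the OpenTelemetry Collector configuration I described a moment ago might look roughly like this, assuming New Relic's OTLP endpoint and a placeholder license key:

```yaml
# OpenTelemetry Collector sketch: receive OTLP traces from the Kong
# opentelemetry plugin and export them to New Relic.
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318            # port the Kong plugin sends traces to

exporters:
  otlphttp:
    endpoint: https://otlp.nr-data.net:4318   # New Relic OTLP endpoint
    headers:
      api-key: <NEW_RELIC_LICENSE_KEY>

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlphttp]
```

And the Kong service, route and plugin setup from the demo could be expressed in declarative (decK-style) form along these lines. The backend URL, header name and collector service address are assumptions for illustration, and exact plugin fields can vary slightly between Kong versions:

```yaml
_format_version: "3.0"

services:
  - name: httpbin-service
    url: https://httpbin.org/anything        # assumed backend for the demo
    routes:
      - name: anything-route
        paths:
          - /anything

# In the demo these plugins are enabled for the service; shown globally here for brevity.
plugins:
  - name: request-transformer                # add an extra header for the backend
    config:
      add:
        headers:
          - "x-client-test:demo"             # illustrative header name/value
  - name: prometheus                         # expose metrics for the Fluent Bit scraper
    config:
      latency_metrics: true
      bandwidth_metrics: true
      status_code_metrics: true
  - name: opentelemetry                      # forward traces to the collector
    config:
      # assumed in-cluster service name for the OTel Collector;
      # newer Kong releases name this field traces_endpoint
      endpoint: http://opentelemetry-collector.default.svc.cluster.local:4318/v1/traces
```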
Now let's see the logs that we captured using Fluent Bit. This is the API call that I hit from Insomnia, which is a REST client, and as you can see here I am able to query the logs based on the information I've given, and all of it is getting recorded. It's not only the API traffic; we will also see the internal errors that are happening within the system. For example, since I'm using the open source container, it is complaining about a Kong license not being available. All of this information captured here will be helpful when we try to debug issues happening in the system. This log will show you what status code we got for this particular request, which is 200, how much time this particular API took, which is 526 milliseconds, and where the request originated from; all of this has been captured in our logs.

The final piece is the traces, where we are able to trace the end-to-end transaction happening from the Kong request until the Kong response is sent. Here you can see the traces that were captured for each of the API calls we picked from the requests, and this is how we can see each span of the request as it goes through the Kong life cycle. Here it shows there are six spans available, starting with the Kong router, and then the two plugins we enabled, the OpenTelemetry plugin and the request transformer, and it shows how much latency each step took. As you can see, on the Kong side it took less than 0.02 milliseconds, and then once it reaches the Kong load balancer it talks to the backend, which is where the backend time comes in.

Now, once we instrument our Kong API Gateway with the observability tools, Fluent Bit and OpenTelemetry, how can we make sure that the Kong system will be highly available and can be observed? Take an example: assume your users are experiencing high response times on their applications. The steps we can take are to look into the metrics and identify which API is giving a high latency response, then use the logs to identify why there is high latency on that particular API, and then use the traces to identify at which point of the Kong processing, or beyond it, the latency is being introduced. Another example: assume you notice that error rates are spiking, say to 20%, which indicates the system may be experiencing some issues. The steps we can take are to identify whether the error is on a single API or across different APIs, using the metrics that we capture, which are the latency and error metrics, and then to identify whether the system is saturated on resources, using metrics like CPU usage or memory usage on the Kong platform. If there is such an issue where there is a resource crunch, then we can take actions to auto-scale the system or add more capacity to make it more highly available.
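As an illustration of that last remediation step, a standard Kubernetes HorizontalPodAutoscaler could be pointed at the Kong data plane deployment so it scales out when CPU saturation climbs. The deployment name, namespace and thresholds below are assumptions, not values from the demo:

```yaml
# Sketch: autoscale the Kong data plane on CPU saturation.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: kong-dp-hpa
  namespace: kong                    # assumed namespace of the data plane
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kong-dp                    # assumed data plane deployment name
  minReplicas: 2
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70     # scale out when CPU saturation exceeds 70%
```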
Thank you so much for listening to this webinar. Below you can find the references that I have used: the blog post talks about how to deploy your Kong Gateway in hybrid mode, and the GitHub link has all the code and configuration related to this demo, so you can find a number of repositories there to go and refer to. In case you have any questions, please let me know, and thank you so much for your time. Have a nice day.