Welcome, everyone, to this session on integrating OpenTelemetry with Appium 2.0, by Zharan Shah. We are glad you could join us today. Without further delay, over to you, Zharan.

Hello, everyone. I'm Zharan, and I'll give a short introduction before getting started with this talk. I work at BrowserStack as a senior software developer. And with that, I'll stop the introduction and start the actual talk.

Essentially, we all know that Appium is about to release Appium 2.0, which is going to be a platform in itself rather than just a library. This talk relates to that 2.0 version, because the platform Appium now offers is what made it possible for us to build something on top of it. The talk is about observability. I won't go into too much detail on what observability is, but I'll give a little introduction to each piece, and there is a demonstration at the end.

This is the agenda for the talk: first, why we need this at all and what this talk is essentially about; then an introduction to Appium 2.0, a little on how it works, and why its platform is what let us build this. Then we'll go through these points one by one. So I'll just start with why this particular talk.

Sorry to interrupt, Zharan. I believe some people are not able to hear you clearly; you may need to speak up a little. Sorry for that.

Sure, sure, not a problem. Can you hear me now? I'll be a little louder; just let me know on the chat if it's all good.

Much better, Zharan. Please go ahead.

Okay, awesome. So, to go through it once again: the agenda for this talk is the observability of any microservices, or services in general, and how Appium 2.0 factors into that.
Appium 2.0 is a platform now, built for developers: we can modify it and build new features around it, and all of that is what Appium offers. What we've built is one such plugin, which you can integrate with Appium to get an observability feature, and I'm going to talk about what exactly observability is.

Moving on: Appium 2.0 introduces the concept of plugins, which I'll cover before getting to observability or the plugin we developed. The plugin feature means we can extend the features Appium already has with new features we want. Those features are not something everyone wants, but there are a few companies, a few individual folks, who would want them, so they can extend their feature set with a plugin; that is what Appium 2.0 as a platform offers.

On top of that platform, we have built an OpenTelemetry plugin, which is essentially an end-to-end observability tooling system. With it we can trace a request from wherever the user is all the way to the end of our service, and see exactly how much time it took, how many requests we got for a particular endpoint, and so on. All of these things become automatically available when you use the plugin, along with the ability to visualize them.

So this is just a brief overview of what the OpenTelemetry plugin looks like, and there's a demo at the end; I'll show the basics there. Before that, again, a quick recap of what Appium 2.0 offers: it has a concept of decoupled drivers now, so you can have independent, decoupled drivers and an ecosystem around them, which is the first thing I wanted to point out. But this talk is more about the plugin ecosystem they're providing, through which we can extend Appium.
Not every feature is required by every automation use case; it's a very broad spectrum. So developers, or in general anyone who has a use case for a particular feature, can develop their own plugins and integrate them with the Appium platform, as you see over here. It's a very basic diagram of how it looks. You can install a plugin at runtime and use it as per your requirements. It's a very generic ecosystem that Appium is offering, which we are going to leverage with this particular plugin.

Now to the actual main topic of the talk: what exactly observability is, and why we would even need this particular plugin. Basically, with today's increasingly large, scaled systems and growing user bases, we don't have a whole picture of our system and its internal state. We do have logs and everything, but there is a limitation there: logs are human-readable, but not easily turned into a simple way of showing how a request flowed, how much time it took, or the end-to-end metrics of our system.

Observability offers a way for us to identify the internal state of the system at a given point in time. Logs are one part of it, yes, but there are multiple parts to it. And since there are multiple services, often distributed — they could be monolithic as well, it doesn't matter, a monolith still has multiple parts — we need to identify what is happening in each particular part of the service. To do that, we have to transfer some data in a consistent data model: define the data models, and then use the data to visualize our system. Essentially, this is what observability is: a consistent way of looking at the internal state of the system.
The common use cases are sets of microservices in large companies with multiple services talking to each other over a messaging protocol or something similar, where we don't really have a picture of where exactly a request ended or where exactly an error occurred. To get that picture, we can use observability as our mechanism for identifying what is happening in the system. This is just the tip of the iceberg; any questions, you can obviously ask in the chat and I'll go through them later. This is only an introduction to observability; there are many, many places where you can find more information.

Next, an introduction to what OpenTelemetry provides. These are the three pillars of observability. There are metrics: the number of requests, error rates, and so on. There is logging, which we obviously already do in any distributed system or, in general, any software we build. And tracing is one of the key features it provides: we can trace a particular request end to end — where it went, how much time it took in each single part of a service. All of it is configurable, all of it you can build on; it's a very generic mechanism, and it's up to us how we use it.

This talk leans more towards the tracing part, where a request goes to Appium. There could be multiple services in front of Appium, or the request could go to Appium directly; either way, we can trace down what exactly happened, and if there's an error, we can track it through as well. Tracing fits naturally, because that's essentially what Appium is: a request-response mechanism, a client-server model, very similar to what Selenium offers as well. This is how our automation works in general.
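The end-to-end view the tracing pillar gives you comes from rebuilding the call tree out of flat span records and reading the time off each hop. A small sketch, with made-up spans shaped like the Appium-behind-a-proxy demo later in the talk:

```javascript
// Given flat span records (as an exporter would receive them), rebuild the
// call tree and report how long each hop took -- the core of "trace a
// request end to end and see where the time went".
function buildTree(spans) {
  const byId = new Map(spans.map(s => [s.spanId, { ...s, children: [] }]));
  let root = null;
  for (const node of byId.values()) {
    if (node.parentSpanId && byId.has(node.parentSpanId)) {
      byId.get(node.parentSpanId).children.push(node);
    } else {
      root = node; // no known parent -> this is where the request entered
    }
  }
  return root;
}

function report(node, depth = 0) {
  const lines = [`${'  '.repeat(depth)}${node.name}: ${node.endTime - node.startTime}ms`];
  for (const child of node.children) lines.push(...report(child, depth + 1));
  return lines;
}

// Illustrative spans for one request: proxy -> middleware -> Appium.
const spans = [
  { spanId: 'a', parentSpanId: null, name: 'proxy POST /session', startTime: 0, endTime: 120 },
  { spanId: 'b', parentSpanId: 'a', name: 'rateLimit middleware', startTime: 2, endTime: 5 },
  { spanId: 'c', parentSpanId: 'a', name: 'HTTP POST appium:4723', startTime: 6, endTime: 115 },
];
console.log(report(buildTree(spans)).join('\n'));
```

This is exactly the view a UI like Jaeger renders as a waterfall: the 120 ms at the proxy decomposes into 3 ms of middleware and 109 ms spent in the downstream Appium call, which is how you spot the bottleneck.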
That's why tracing is going to let us identify what exactly happened in Appium, or whether the request even reached the Appium service at all — maybe some other service in front of it failed, say a validation failure. That's the kind of mechanism we provide, and there are multiple use cases for it. I'll go through one in the demo, and we can add more for whatever use cases people have.

This is obviously an example of a microservice architecture, where there are multiple microservices, a UI, a front end, a backend, a database, and everything. Some requests flow through one particular service while other requests don't, and we can identify each particular request, trace it down, and filter on it as well. In general these are the use cases, and they're broad, given the kinds of services companies offer nowadays, whether B2B or B2C. In a DevOps or SRE role, we need to identify what exactly happened, and we need data to track it down. That is where we use observability and the tracing abilities we get via OpenTelemetry, which is one of the standardized tools for building a consistent data model.

So rather than talking about it too much more, I'll just show a demo; if there's any confusion, we'll be able to identify where exactly the question is. This is the repo we have built, which is open source, so you can obviously go and check it out yourself, and I'll start with a demo to showcase what exactly we are offering in terms of the plugin itself.
Before that, let me just quickly check the chat and make sure nothing is breaking. Is it okay now? I mean, I haven't changed anything, so it was probably some kind of internet hiccup; apologies for that. I'll share the slides with the team, and the slides will be available for you to look at, so everything I've said that the audience couldn't hear is already in the slides and you can go through it.

It could be a problem with any of the participants too, Zharan. They are fine right now.

I'll focus on the demo now. So basically, again, Appium offers a way to install plugins, and this is the plugin we have developed, which we can use. There are READMEs in the open-source section of the repo, and you can check out how to install it and everything. I'll just start with the plugin, and if any questions come up later, I'll update the README as well to make sure everyone understands.

So this is how I started the plugin, and I've started Appium now; Appium is currently running. If I make a cURL request, I get a response back. Before that, notice: my call goes to localhost:3000, whereas my Appium started on 4723, which is the default Appium server port. This is where my proxy comes into play. It's just a very simple proxy, there to showcase the microservices scenario and what we can do with Appium behind one.

Over here, a cURL request has gone through, which hits the exporter config and so on — the place where all the traces being collected can be exported to certain servers, one of which is a service that gives us a way to visualize the trace data. So essentially, right now, all my traces, as you can see over here, are automatically generated.
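For reference, the exporter configuration I'm sending to the plugin's config endpoint looks roughly like this. Both the field names and the endpoint shape here are my illustration, not the plugin's documented API — check the repo's README for the real contract; the Jaeger collector URL is Jaeger's standard HTTP trace-ingestion endpoint:

```javascript
// Hypothetical request body for the plugin's exporter-config route.
// Field names are illustrative; see the plugin's README for the actual API.
const jaegerExporterConfig = {
  exporter: 'jaeger',
  // Jaeger's default HTTP collector endpoint when run locally via Docker.
  endpoint: 'http://localhost:14268/api/traces',
};
console.log(JSON.stringify(jaegerExporterConfig));
```

You would POST something like this body to the route the plugin registers on the Appium server, then verify via a GET on the same route that it shows up as the active exporter config.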
Before that, let me show what it would look like without the plugin. I've started without the plugin; nothing is activated, as you can see here. I have the OpenTelemetry plugin available, but if I make a cURL request or anything, there is no route available, because my plugin has not been activated. Now, if I install my plugin and use it, there's an added route — Appium 2.0 allows any plugin to add its own routes — and now I can make a cURL request against my exporter config. This is more of an API thing that I'll document in the README part of my repo.

Apart from that: for example, over here I have a cURL request, a GET call to the config, and it shows the active exporter config and all of that, which is part of how the observability setup works. Now let's see how I can visualize this. Obviously, I can see the trace over here, but it's of no use to me like this — it's not very readable, just a dump of raw trace data. So there's an API I've exposed in the plugin where you can define your own exporters and pass all the configuration related to the exporter. For now, I'm just using a default configuration for Jaeger.

If I do this, I have successfully initialized the Jaeger exporter. Essentially, there is a Jaeger service running in the background that I started with Docker. And as you can see now, it shows the Jaeger exporter has exported successfully for one particular span. So now I make a cURL request over here — for example, a config call. Before that, let's visualize how it works. If I go over here to the Jaeger UI, there are my three services, including the Appium service itself, which has our plugin.
Over here, it shows me one particular HTTP POST call directed to the configuration page. But let's say I have Appium behind a microservice. I can look into this, and for example I can see there's an Appium proxy, which has an Express middleware — the usual Node.js Express server stuff — then a rate-limit middleware, then a handle-proxy middleware, and in the end an HTTP GET call that goes to Appium. You can see a warning over here, which you can ignore for now. And for all the information about the request: you can trace down where exactly it went, from where, to which particular service, what the endpoint was, and how much time it took, which is very important for tracing down any bottleneck in our system. All of that is available over here.

So, for example, let me change the rate limit, and based on that I'll showcase what the added tracing gives me. Just a second. I have my proxy running, and for that I'll have to go through this particular repo; sorry about that. I'll modify the rate-limit settings so I can showcase how requests get limited. Over here, I have a maximum of 10 requests; let me change it to one and then restart my proxy.

I've restarted my proxy now and it's running. So now I make a normal GET request via cURL. My request has gone through, and I can see it in the Jaeger UI as well — there it is, timestamped just a few seconds ago. Now let me make the request again, and I believe there will be an error for me: over here, after two requests, I get "try again later".
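The rate-limit middleware in the demo proxy is a stock Express middleware; the mechanism underneath is a fixed window — at most `max` requests per `windowMs`, everything past that rejected before it ever reaches Appium. A plain-JavaScript sketch of that mechanism (not the actual middleware from the demo):

```javascript
// Fixed-window rate limiter: allow at most `max` requests per `windowMs`.
// The demo proxy uses an off-the-shelf Express middleware for this;
// this sketch only shows the mechanism.
function makeRateLimiter(max, windowMs) {
  let windowStart = -Infinity; // forces a fresh window on the first request
  let count = 0;
  return function allow(now = Date.now()) {
    if (now - windowStart >= windowMs) {
      windowStart = now; // new window: reset the counter
      count = 0;
    }
    count += 1;
    // false -> the proxy responds 429 and the request never reaches Appium
    return count <= max;
  };
}

// With max = 1, as in the demo, the second request in the window is rejected.
const allow = makeRateLimiter(1, 60_000);
console.log(allow()); // true
console.log(allow()); // false
```

That rejected second request is precisely the one whose trace, in the next step, shows no Appium span at all.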
This is where tracing helps me identify what exactly happened. If I go through the trace page, I see there's another request, and it has an error. What's the error? Looking at the trace, it never went through Appium: my request never reached the Appium server itself; it was halted by some other service of ours, which gave me a non-200 response, as I can see over here and track through.

So this is where integrating all the microservices we have, or even the independent parts of a monolithic system, lets observability help us identify bottlenecks in the system, and we can filter things down over here. Jaeger is one service that lets us visualize all this, and it has its own data store, but there are multiple other exporters available in the plugin as well. If I open up my repo over here, the README pages list them; there is obviously Jaeger — (We have one minute, Zharan.) — sure — so we have Prometheus, Jaeger, and Zipkin. You can go through the page; I will share all the details with folks here. And any questions: there's the README over here, you can open issues on the open-source repo for any queries you have, and we can talk here as well.

Brilliant demonstration, Zharan; that was a great talk. Thanks, Zharan, for sharing your experience with us today.
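Spotting "it never reached Appium" programmatically is just a filter over the exported spans: find the spans with a non-2xx status and look at the latest one. A sketch with an illustrative span shape (real OpenTelemetry spans carry the HTTP status in their attributes and an error flag in `span.status`):

```javascript
// Find the failing hop in a trace: among spans with a non-2xx HTTP status,
// the one that started last is closest to where the error originated.
// Span shape is illustrative, not the OpenTelemetry wire format.
function findFailure(spans) {
  const failed = spans.filter(
    s => s.httpStatus && (s.httpStatus < 200 || s.httpStatus >= 300)
  );
  if (failed.length === 0) return null;
  return failed.sort((a, b) => b.startTime - a.startTime)[0];
}

// The rate-limited request from the demo: the proxy's rate-limit middleware
// answered 429, so no span for Appium itself ever exists in this trace.
const trace = [
  { name: 'proxy POST /session', startTime: 0, httpStatus: 429 },
  { name: 'rateLimit middleware', startTime: 1, httpStatus: 429 },
];
console.log(findFailure(trace).name); // 'rateLimit middleware'
```

The same check, run over traces in bulk, is how an SRE dashboard can answer "which service is swallowing my requests" without reading logs by hand.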