So welcome everyone to the session, A Peek into Observability from a Tester's Lens, by Parveen Khan. We are glad she could join us today, and without further delay, over to you, Parveen. Thank you so much — first of all, thank you so much for having me at the APM conference; I'm super excited to present. I can see some people are still trying to join the session, so maybe we can wait a few seconds to give them time to join, and then I'll get started. Sure. Let me quickly share my screen first. Okay, I think I'll get started without further delay; I can see a lot of people have joined. So hello everyone, good morning, good afternoon, and good evening, wherever you're joining this session from. I'm Parveen Khan, and I'm a senior quality analyst consultant at ThoughtWorks, based in London, UK. Today I'm going to share my experience and my learning journey; as you can see, the talk is called A Peek into Observability from a Tester's Lens. Many of you may already know what observability is, and a few of you may be new to it. The aim of this talk is to introduce you to the topic and give you some food for thought on how it could be helpful for testers. This is basically my own story. What you can expect from this talk is that I'll share how I got introduced to this term, how I learned about it, and how I introduced it within my team, along with some of the theory of what exactly it is. So let's get started. Before I even jump into the topic and say, "this is what observability is and this is how we do it,"
I really want to quickly take you through a simple scenario, and the credit for this scenario goes to Pierre Vincent, from whom I took the inspiration. I like how he uses examples, so I thought I'd use this one as well. So imagine: what would you do if you came across a foggy road like this while driving in very strange weather conditions? Think about that. The first thing that comes to mind is that we need to slow down, right? Why do we need to slow down? Because we don't have visibility of what's ahead of us, and we consciously make a decision based on the risk. If we can't drive fast, does that mean we are bad drivers who can't drive fast enough on a foggy road? Not at all. In fact, we are good drivers, because we are making decisions based on the risk. So does that mean the car isn't good enough to drive faster on this foggy road? Our cars can drive much faster, but we as drivers have to hold them back, because we know it's dangerous to drive when there's no visibility. So in this situation we are basically stuck. But what about airplanes? They fly among the clouds all the time. That's because pilots have additional instruments for that, and they don't have to rely solely on their eyes. Now, what does this have to do with software development? We all know the current trend is about going faster and delivering value faster, and we are all trying to adopt practices like agile and DevOps, working with distributed systems, working in a microservices world. We do all this because we want to deliver something of value quickly. And by doing this, all we are doing is building faster cars.
The reason we are moving to a microservices architecture, or trying to work with distributed systems, is that we want some kind of simplicity during development. But on the other hand there is a lot of complexity, with multiple moving parts at the same time, which makes it even more complex. When we distribute a system, we also distribute the places where things might go wrong. So we know we need visibility — but how can we get visibility into a system? The answer is by having observability. Before going ahead and trying to understand what observability is, let's first understand why we need it in the first place. I always look for real-life examples to understand any given concept; I don't get convinced just by reading theoretical material. So I'd like to share my real-world experience of how I came to understand why we need observability. To give you a bit of context: I joined a new team, the entire team was new, and I had the opportunity to work on a really exciting and interesting product. It was an automated system built on a microservices architecture; for a bit more context, it consumed real-time data from a lot of controllers to handle the business logic, so it was quite complex. The work we were doing as a team was to build new features. Now, as a tester, when we start working on a new product, what do we try to do? What is the first thing you try to do? You can just drop something in the chat: what is the first thing you try to do whenever you join a new team, knowing that the product is completely new and the domain is new?
So, yes: requirements gathering. We read documentation. That's right — we try to gather the requirements and read the documentation, and the reason we do this is that we want to understand the product. Understand the architecture, yes — we want to understand the architecture and understand the product. That's exactly what I was trying to do when I joined this new product. But the more I learned about the product, the more I felt that it was too complex. The reason I felt this way was that within a few weeks of joining I had noticed a pattern: a lot of tickets were being marked as blocked. I saw production issues each day; developers would pick them up, and of course I would pair with them, and we would try to investigate to find the root cause. The team would spend quite a lot of time — days and weeks — and then mark those tickets as blocked, because they did not have any information about what was causing the issue, or even where the issue was. That made me think about what was wrong here; at this point I really stopped using "the product is complex" as an excuse. Interestingly enough, at the same time I was reading a lot about observability, without knowing when and where it's used. I was trying to understand the concept by using different tools that promise to deliver it, to see what it brings.
At the same time, a conversation with one of my team members, a developer, gave me some food for thought. We talked about why we couldn't find information about those issues, and that's when it clicked: this is exactly what we are missing in our product, and it's called observability. That conversation was a light-bulb moment for me, because it unlocked answers to a few of my questions — though of course it opened up a lot of new questions as well. I understood that we had very little or no visibility into our system, which is why so many issues were marked as blocked: the team could not debug them to figure out the root cause, and because they could not debug, they could not find any information, could not say how to fix them, or what to do next. That is exactly where I understood what we needed in the system, and I tried to learn a bit more about it. So now I've given you some context about why we as a team needed observability. I keep talking about observability, so let's take a look at what exactly it is. There are quite a lot of definitions you can find if you Google it, but a simple one says that observability is a measure of how well the internal states of a system can be inferred from its external outputs. It means you can answer questions about what's happening on the inside of the system just by observing the outside of the system, without having to ship new code to answer those new questions. And when systems are down, we need to find answers by asking questions as quickly as possible.
If all we are doing is asking questions and not getting answers, it means something is lacking. The system needs to be observable so that it can explain what's happening inside — so that we can find out what's happening inside the system just by looking at it from the outside. The question is: how can we make the system observable and visible? The answer is by using data. Now, you might think: how can we get the data, and what type of data do we need to make the system observable? We can get it by adding instrumentation, and that instrumentation gives us data in one of three forms: logs, traces, or metrics. I'll pause the story a bit here, because I want to talk in a little more detail about what exactly logs, traces, and metrics are, to understand how we can use this kind of data to make the system more observable. First, let's look at logs. What are logs? Quite a lot of people already make use of them: a log is a simple message that carries some information. It has a timestamp and a payload, which can help give us more context. Depending on what we log, it can tell us which service the message is coming from, what that service has done (the payload), and at what time it occurred. That's what a log is. Now, getting back to distributed systems:
we do not want to go into each service and look at its logs separately. Rather than having separate logs for each service, we need to have them centralized in one place, so it's easier for us. It's not just about having some logs in place to make the system observable; it's about having centralized logs. Let me give you an example of why I say we need centralized logs. While I was working with this team, we did have logs — it's not that we never had them — but they were stored separately for each service; we used NLog for that. To access those logs we had to SSH into each of those services, and the only way to view the full logs was with Notepad++. Whenever there was an issue and we wanted to find out what was going on, we would end up with multiple Notepad++ tabs open. It was such a pain, and to add to that pain, the only way we could search the logs was Ctrl+F. Can you imagine how painful that was? This is why it's not just about having logs; it's about having logs that are easily searchable, and the way to make them easily searchable is to have structured logs. At that point, by the time we had moved from one service to another looking at what information each one gave us, we would lose the context. That is why we need all the logs in a centralized place, where they are easily searchable. Now, after looking at logs, let's look at metrics.
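Before moving on to metrics, the structured-log idea can be sketched in a few lines of Python. This is only an illustration, not the setup my team used; the `payments` service name and the fields are invented. The point is that each record becomes one JSON line carrying the timestamp, service, and payload, which is what makes centralized logs queryable rather than something you hunt through with Ctrl+F:

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line so logs are machine-searchable."""

    def __init__(self, service):
        super().__init__()
        self.service = service

    def format(self, record):
        return json.dumps({
            "timestamp": self.formatTime(record),  # when it occurred
            "service": self.service,               # which service it came from
            "level": record.levelname,
            "message": record.getMessage(),        # the payload
        })


# Wire the formatter onto a logger for a hypothetical "payments" service.
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter(service="payments"))
logger = logging.getLogger("payments")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("charge accepted")  # emits one JSON line instead of free text
```

A log shipper would then forward these JSON lines from every service into one central store, where they can be filtered by service, level, or time.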
A metric is a simple number — a value that expresses some data about the system. Metrics can represent different things: you might have system metrics, application metrics, and business metrics, and they are usually calculated over a period of time. For example, a system metric can tell you how much memory is used by a process out of the total. An application metric could be something like the number of requests per second being handled by a service, or the error rate of an API. A business metric could be something like how long it takes a user to log in, or how long it takes a user to navigate between different pages — depending on the product you have. Metrics are really good at aggregating things, but not so good at pinpointing specific details, such as which customer was having a problem at a particular time. How can we do that? By using more data, in the form of traces — the next thing to understand. A trace tells a story that gives you a deeper level of detail: it shows the entire flow of a request, which is really valuable while debugging. A single trace shows the activity for an individual transaction — or, you could say, an individual request or event — as it flows through an application. It shows the request end to end, and traces are a critical part of observability because they provide a lot of context.
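To make the trace idea concrete, here is a minimal sketch — not any particular tracing library, just the core data shape. The span names, services, and durations are all made up for illustration; the key point is that every span in one request shares a `trace_id`, and `parent_id` links the spans into an end-to-end story:

```python
import uuid


def new_span(name, trace_id, parent_id=None, duration_ms=0.0):
    """One span: a single unit of work, linked to its trace and its parent span."""
    return {
        "trace_id": trace_id,       # shared by every span in one request
        "span_id": uuid.uuid4().hex[:8],
        "parent_id": parent_id,     # None for the root span
        "name": name,
        "duration_ms": duration_ms,
    }


# One request flowing through three services, stitched together by trace_id.
trace_id = uuid.uuid4().hex
root = new_span("POST /checkout", trace_id, duration_ms=120.0)
auth = new_span("auth.verify", trace_id, parent_id=root["span_id"], duration_ms=15.0)
charge = new_span("payments.charge", trace_id, parent_id=root["span_id"], duration_ms=90.0)
spans = [root, auth, charge]

# The slowest child span shows where the request actually spent its time.
slowest = max((s for s in spans if s["parent_id"]), key=lambda s: s["duration_ms"])
print(slowest["name"])  # payments.charge
```

Real tracing systems (OpenTelemetry, Jaeger, Zipkin, and the like) propagate the trace ID across service boundaries automatically, but the mental model is the same.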
You might have noticed that I have said quite a few times now that with observability we can ask questions — but what kind of questions can we ask? It could be any kind of question, depending on the context of the product you're working on, but here are some examples of what we can ask and answer if we have visibility into the system. Why is X broken — why is my service broken? What went wrong during this release? What other services depend on my service — for example, if you're working on one single service? Why has the performance degraded over the past quarter? What logs should we look at right now, when there is an issue? What did my service look like at point X? These are some of the questions you could ask while looking at the logs or the traces. And just as with DevOps: we cannot say we are adopting or following DevOps just by having some automated tools in place. It is more than tools — it is a cultural and mindset change. Similarly, we cannot say we are doing observability just by having different tools in place. It is not about getting the tools, having lots of data, and declaring that our system is observable. It is a cultural change and a mindset change as well.
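Coming back to those example questions: once logs are centralized and structured, a question like "why is X broken?" becomes a query over data rather than a Ctrl+F hunt. A toy sketch in Python — the log lines, service names, and messages are fabricated for illustration:

```python
import json
from collections import Counter

# A handful of centralized, structured log lines (one JSON object per line).
log_lines = [
    '{"service": "orders", "level": "ERROR", "message": "timeout calling payments"}',
    '{"service": "payments", "level": "ERROR", "message": "card declined"}',
    '{"service": "orders", "level": "INFO", "message": "order created"}',
    '{"service": "orders", "level": "ERROR", "message": "timeout calling payments"}',
]


def errors_by_service(lines):
    """Count ERROR entries per service -- a first stab at 'why is X broken?'."""
    counts = Counter()
    for line in lines:
        entry = json.loads(line)
        if entry["level"] == "ERROR":
            counts[entry["service"]] += 1
    return counts


print(errors_by_service(log_lines))
```

Here the query immediately points at the `orders` service (two errors, both timeouts calling `payments`), which is exactly the kind of effect-to-cause navigation that was impossible with per-service Notepad++ tabs.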
Now, most of us in the audience are testers, so you might be thinking: okay, we get it, observability is this and that — but what's in it for testers, with all these new tools in place? How does it help testers? I'll give you some of the benefits — not all of them, but some. One is that it makes it easier to find information around issues. For example, while we are testing we might see some unexpected behaviour, or some failures. Having these kinds of tools in place allows us to look under the hood and see what's happening with each request — and not just that, it allows us to learn more about the system, how it communicates and how it works. Previously, I would use the dev tools to see what was going on when something did not look right while I was testing, but I wouldn't get enough information. Having these tools helps me get more information that can be added to tickets when raising bugs, or used in conversations with the team, giving the team more context around any kind of issue. And it's not just about finding more information while investigating issues; it can also help us deepen our understanding of our products. Another useful benefit for testers is that it is a tool for exploring and asking questions. Now, how many of you think that we testers are curious explorers?
You can just drop something in the chat: do you think testers are curious explorers? Yes — they have to be, of course. I think we testers are very curious explorers, and we are great at asking questions; I certainly ask a lot of questions when I don't understand things. As we say, testers are great at exploratory testing — not just good at asking questions, but always curious to find more information about the system. While exploring the logs, metrics, and other such data, testers might point out where more instrumentation is needed. And not just that: it also supports testers in testing in production, allowing a team not just to shift left but to shift right. For example, as testers making use of the logs, metrics, or other data while testing, we might say: we are not finding enough information when something goes wrong — can we improve our instrumentation, can we improve our logs, so that whenever this kind of issue happens in production, there is enough information to figure it out? Now, I'm not saying that if you have logs you will always be able to find everything, but it might help. Because we are curious explorers who explore the entire product, we can also add value by improving our instrumentation when needed. I saw this tweet, and I really like how Maaret put it: a lot of times, good debugging and good exploratory testing are indistinguishable. When a developer explores, they more often call it debugging.
Whether they know there is a problem or just suspect there is one — what do developers, or anyone, do while debugging? Do they know where the problem is? No; they try to find where the problem is, and that is exactly what debugging is. When testers explore in the same way, it is close to the same thing: we don't know, but we are still trying to explore and learn more about the system. So, to summarize: by making systems more observable, anyone on the team can more easily navigate from the effect to the cause in the product. I'm not saying you get the answers straight away, but it makes debugging easier. The goal of observability is not just to collect logs, metrics, and traces, but to use that data to get feedback. And it doesn't just allow us to find the knowns of the system; it also allows us to find the unknown unknowns. It's not only about what we know about the system and confirming what we know — it is more than that. It is about finding the unknown unknowns, and making the system observable helps us figure out what the unknown unknowns of our system, our product, our application are. And every learning experience, every story, every journey has takeaways. I had some takeaways from this learning experience, from trying to understand, learn, and implement this within my team: thinking outside the box. The key takeaway for me is that we as testers can go out of our way — we care about and advocate for quality, and that relates to how we try to bring improvements when we see something wrong or want to improve quality.
There are different ways we try to improve things: bringing improvements to the process, maybe bringing in new tools, or test automation — but that's not the limit. I learned that we do not have to limit ourselves and say, "this is not related to testing, so let's not look into it or learn about it." Throughout this experience, I saw the problems my team was going through. For example, developers were getting frustrated when they could not resolve production issues, and I noticed that product owners were getting frustrated because they did not have answers to give the client, and did not have enough information about those production issues. I didn't know the answer or the solution, but by being active in the community, seeing new tools and concepts, exploring them, and trying things out myself with different tools, I found something I could present to my team as a suggestion — you could call it a quick POC, a proof of concept. So I tried not to limit myself to testing tools only, and to think outside the box to help my team. Before I left that team, we were not all the way there in terms of implementing the whole of observability, but we had already taken the first step: from having no visibility to having structured, centralized logs that were very easy to query.
The steps we took were small ones: starting by implementing centralized logs, moving all the logs into one place so they were easily searchable, then starting to implement traces. We took small steps, one by one, to start the implementation. So, to end with, I would really like to say that observability gives the entire team the power to get visibility when needed. It's not a tool, or data, used only by certain roles or certain people; it gives power to the entire team when we get that visibility. Observability is much more powerful when you apply it with the right mindset, a clear process, and the right tooling in place. It also allows teams to become more proactive about issues rather than reactive. Again, I'm not saying observability will solve everything, but quite a lot of the time we only find out there are issues when users complain. By having these kinds of things in place, a team can become more proactive rather than reactive. It empowers everyone on the team — as I said, that could be developers, ops engineers, SREs, or testers. And we as testers need to be comfortable using these kinds of tools, so that we can learn more about the product and add value — including adding value to this kind of implementation. So, apart from getting introduced to observability and how it is helpful,
if there's one more thing I want you all to take away from my talk today, it is this: we all believe in diversity, and having diversity means building powerful solutions, right? Then why not have diverse roles — developers, testers, ops engineers, SREs, or any other role — involved while building these kinds of solutions, like building observability into the system? It adds so much value when we as a team build these solutions, rather than saying "this is not something for us" or "it could not be useful to us." That is one thing I learned throughout this: we as testers can also add value here, and can make use of these kinds of tools and of observability. Thank you so much for joining my session today; I'll be happy to answer any questions you have. And be sure to check out my blog — I share all my learnings there — and don't forget to follow me on Twitter. Thank you again for joining my session and listening to my story. Thank you, Parveen, that was really a great session; thanks for all those insights on observability. I learned a lot. I'm not from a testing background myself, but thank you so much for all that understanding — it was quite informative. In terms of questions, I feel the session was awesome; nobody has any questions so far, it was so good with the examples and everything. So, yes, I don't see any questions. Being the host, perhaps I can ask one, if nobody else has. As I said, I'm not from a testing background, so purely from a backend engineer's perspective, I always had this question, and this is probably a nice time to ask it.
Being a developer, I generally do unit testing — I write component tests, integration tests, functional tests, and so on. I always used to wonder how QAs' tests, when they write them, are different, and whether the two roles are actually merging into one another. If you could give me some insight into that, a few people like me from a non-testing background would also get help from that answer. Yeah — so you mean to say, how — sorry, can you repeat the last part? I was just trying to stop my screen share. Can you repeat that part, please? I'm so sorry. Hey, no worries. So, just to repeat — and the question is quite abstract, by the way. The idea is that developers write unit tests and do debugging, and when testers do exploratory testing, some of that merges over time. Along similar lines, when we do TDD or BDD, we write tests, integration tests, and so on. I always wanted to know the mindset of a tester when they do QA: with what sort of mindset do they work, and how does it differ from the developer's mindset? That would help me understand how the two roles differ. Yeah, I think you're trying to understand the mindset.
It's hard to explain, but one thing I would say — and this is what I was mentioning during my talk about how this concept helps us — is that as testers we have this mindset of asking questions and being curious about how our users are going to use what we release. For example, if I'm working on a feature, when I'm testing it I look at the different layers — the API, the code, or pairing with the devs at the unit-test level — but even then I always have my user hat on, and my curiosity hat on, asking: why are we doing this, why are we doing that? Whatever we are testing, the user is always on our mind while exploring. That's how I would describe the mindset when we are testing. I don't know exactly what the mindset of developers is while they are writing the code or implementing the feature, what they think about. But as testers — I can't speak for everyone, but I think it is quite common — when we are exploring, we make sure we use different personas: not just "this is a user and he or she will use it in this way," but trying to think from different personas' points of view and see what the journeys would look like. We look at the whole picture, not just one single piece of the puzzle — it's not "that piece works fine, yay, done." No.
We look at the entire picture, and then try to look at it from different personas and different user perspectives — I think that's how I would describe the general way of thinking. Perfect, I think that answers my question. Thank you so much for that. Oh, I see we have one comment now — I think that's more of an answer than a question for me. Correct — it summarizes what we have just discussed. Yeah, thanks. So I think we are pretty much at the end then, if there are no more questions. And even if you don't have questions — I can see people are still around, so you can just drop in some kind of conversation; it doesn't have to be a question, that's fine too. I can see Serena saying that developers think about how to make things work, but we testers think about how to find edge cases to break it. Exactly — I do agree. Well, not exactly "break it" in the sense of just trying to break it, but trying to see whether it breaks in some edge cases when different kinds of users use it. So I do agree with that, yes. Correct. I think that's pretty much the end of the first session. Thank you, Parveen, for sharing your experience with us today. Thank you so much for having me, and thank you so much for being an amazing host.