Welcome to this demonstration of Splunk APM, the application performance monitoring tool. Today's applications are increasingly complex: we have microservices; public, private, hybrid, and multi-clouds; elastic and ephemeral resources; and serverless environments. At the same time we are working with CI/CD and shift-left testing, and our users now expect instant gratification. So let's look at how Splunk APM helps solve these problems.

In most cases, though, you don't spend your time staring at a dashboard to check the status of your product. Usually we find out that something is going wrong when we receive an alert, and in this case my alert has come in through our on-call product. The on-call product has told me a couple of things: I am the person on call, I have been paged, the alert has triggered, and I have other alerts that I've already resolved. So let's step in and take a deeper look at what's going on.

First, let me acknowledge that I've been paged for this environment. Now I can start looking at the information it's giving me: my error rate over the last 10 seconds has been quite high, and it occurred across at least 10 requests. That minimum request count helps avoid a storm of events that are all really the same event; I'll sketch the idea in code below. From here I can do other things: see whether I've had similar events, find out who the other stakeholders are, the people who need to keep track of this, and step into more details, from the alert itself, to the runbook that suggests how to resolve it, to the alert details. I can see, for instance, that the metric spiked past my threshold, dropped off, and then a new spike recreated the alert, so at this point in time I've seen those two occurrences.

So let's take a deeper look at this alert in Splunk's APM tool. Here, looking at Splunk APM, I can instantly see what happened to cause the alert and what the triggering conditions were. A more detailed view shows me the boundaries that were crossed to create this alert in the first place. From there I can step into the APM tool itself, the application performance monitoring tool. I'm now in a troubleshooting mode, and interestingly enough, the issue being pointed out is on checkout. But checkout is not actually the underlying cause; the root cause is the lowest instrumented service reporting an error. So with that, let's take a deeper look into this application space. What you're looking at here is part of the service map environment, and the service map is built for you: you get to see all of the different pieces you're interested in. In this case I'm looking at API, because API has told me that's where my alert is.
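As an aside, here is a minimal Python sketch of the thresholding idea behind that detector: fire only when the error rate is high over a minimum number of requests. The threshold values and function names are illustrative assumptions, not Splunk's actual detector API.

```python
from dataclasses import dataclass

@dataclass
class Window:
    """Rolling 10-second window of request outcomes for one service."""
    requests: int
    errors: int

def should_alert(window: Window,
                 error_rate_threshold: float = 0.5,
                 min_requests: int = 10) -> bool:
    """Fire only when the error rate is high AND enough requests were
    seen, so a handful of failures can't trigger a storm of alerts."""
    if window.requests < min_requests:
        return False  # too few requests to be meaningful
    return (window.errors / window.requests) >= error_rate_threshold

# Example: 8 errors out of 12 requests in the last 10 seconds -> alert
print(should_alert(Window(requests=12, errors=8)))  # True
```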
The service map can show me the pathways through the system, and I'm going to step over and take a look at what the dashboard is telling me. With Splunk APM we get the data in real time; our data is streaming in, so we can see activity as it progresses and as it changes. In this case I'm showing data at a 10-second granularity, and I can see what's happening: activity has increased, my latency has gone up, and my error rate is quite high. From here I could step into different things: copy or share this information, download it, or even look at traces. However, applications do not live by themselves; there are multiple layers underneath. So we also need to look at the underlying metrics, the host metrics, or in this case, since I'm running in a Kubernetes environment, the Kubernetes metrics as well. These give me the capability of looking deeper into the product without swivel-chair integration, moving from one screen to another across separate tools.

Let's step back for a moment. This is the overall service map we're looking at. Service maps are generated for you: as requests go through your system, the APM tool keeps track of them and builds the map of what's happening and where each request is being touched; a minimal sketch of that idea appears below. We touched on this a little already; for instance, we can see that an alert was triggered here, along with various pieces of information telling us more about it.

So let's step into API, and rather than going to the dashboard, let's look at troubleshooting. The troubleshooting view gives us more information about what's happening inside our service map. The size of a node indicates how many requests are going through it. The colored circles can indicate an error rate or a root error rate; again, "root" means the lowest instrumented service reporting the error. The thickness of the lines gives us more information, and the colors convey error rates as well. Over here I can see the service's dependencies. In this case API was the starting point of the inbound dependencies, but we can go to catalog and checkout and see the two underlying pieces. Again, we see reported errors there, but we're not seeing the root indication.

Let's take a look at the next level down: the request and error breakdowns. These tell us a huge amount of information, all the way down to showing where those root errors actually appear. Our tagging, with its cardinality, shows us what's happening in each of these pieces. We're seeing a huge number of errors coming in from payment, so let's add payment to our filters. With this we can now see that errors are being reported, and down here payment is showing me that it is the lowest service reporting an error. Hovering over any of these, I can get more information.
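To make the service map idea concrete, here is a minimal Python sketch of how such a map can be derived from span data, joining each span to its parent and counting service-to-service edges. The span tuples are invented for illustration; this is not Splunk's implementation.

```python
from collections import Counter

# Each span: (trace_id, span_id, parent_span_id, service, is_error)
spans = [
    ("t1", "s1", None, "api", False),
    ("t1", "s2", "s1", "checkout", True),
    ("t1", "s3", "s2", "payment", True),
]

def build_service_map(spans):
    """Derive service-to-service edges by joining spans to their parents."""
    by_id = {(t, s): svc for t, s, _, svc, _ in spans}
    edges = Counter()
    for trace_id, _, parent_id, service, _ in spans:
        if parent_id is None:
            continue  # root span, no caller
        caller = by_id.get((trace_id, parent_id))
        if caller and caller != service:
            edges[(caller, service)] += 1  # edge weight = request count
    return edges

for (src, dst), count in build_service_map(spans).items():
    print(f"{src} -> {dst}: {count} request(s)")
```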
Here I get to see what's known as the RED monitoring pattern, the Rate, Errors, and Duration for each of these pieces, including the amount of time spent between each of them. Interestingly enough, I can also do a deeper analysis dive, so let's take a quick look at that before we go further. What's different is that the analysis now takes a single service and shows me all of its information, cross-correlated with all of the other pieces. For the payment service, we can see what's happening: the payment/execute endpoint shows 8,700 requests coming in, but 1,900 of them are errors, and payment is the lowest service reporting them. We can also see what the environment looks like, which HTTP methods were used, which status codes were reported, who is making use of the service, and which versions are showing up, so we can do breakdowns across all of this. This is an amazing amount of information. In fact, you can even step into the sample traces, the example trace data, which shows even more. We know, for instance, that we have a 402 failure; that's where the root is, and as it passed back up the stack it became a more generic 500 error. We can open any of these up and get even more detailed information, all the way down to what the server was, where it lived inside this environment, what its activities were, and what kind of error occurred, in this case a timeout. We can continue to expand into any of these environments that we want to.

We can also see where our application is spending its time: the request spends most of its time in the application, some of its time in the database, and some of its time is eaten by the network. We can look in more detail at span performance, how the operations cross over in each span, which again gives us a tremendous amount of information. And we can see more detail using Tag Spotlight, our tag-based view for analyzing what each service is doing; a sketch of this kind of per-tag breakdown follows below. Likewise, we could switch to other services and look at those environments, seeing, for instance, that in this case we have lots of errors, but they're not the root cause errors.

We've now narrowed it down: we know that payment is our probable culprit, and the payment view can show us a lot of information. We can go in and look at our tracing data at any point in time, look at the traces, look at the waterfall plot to see what's happening, and expand it to look at specific tags, information for slicing and dicing. But I'm really interested in where my errors are coming from, so now I can start working with a breakdown chart. We saw this a little bit already, but let's see what it looks like when we break it down. We have problems with both our gold and our platinum users, but they all lead to payment. Taking a look at payment, I remember seeing that I had two versions, so let's do a breakdown by version.
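Here is a minimal Python sketch of the kind of per-tag RED aggregation a view like Tag Spotlight performs, grouping spans by a tag such as version and computing rate, error percentage, and average duration. The tag values and numbers are made up for illustration.

```python
from collections import defaultdict

# Each record: (tag_value, duration_ms, is_error) for one service's spans.
spans = [
    ("version-a", 120, False), ("version-a", 450, True),
    ("version-b", 130, True),  ("version-b", 110, False),
]

def red_by_tag(records, window_seconds=10):
    """Aggregate Rate, Errors, and Duration per tag value."""
    buckets = defaultdict(lambda: {"count": 0, "errors": 0, "total_ms": 0})
    for tag, duration_ms, is_error in records:
        b = buckets[tag]
        b["count"] += 1
        b["errors"] += int(is_error)
        b["total_ms"] += duration_ms
    return {
        tag: {
            "rate_per_s": b["count"] / window_seconds,   # Rate
            "error_pct": 100.0 * b["errors"] / b["count"],  # Errors
            "avg_ms": b["total_ms"] / b["count"],        # Duration
        }
        for tag, b in buckets.items()
    }

print(red_by_tag(spans))
```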
And we can see that both of the versions are reporting problems. So let's ask: how about nodes? An application problem can be caused by an underlying infrastructure problem, and in this case we now have a smoking gun. Two of our nodes are working just fine, but one of them seems to be having problems; it's not passing data through. In fact, right here I can see the Kubernetes node that's causing the problems: it's reporting the root errors, and I can step in and go directly into inspecting it. I could also jump immediately into the logging environment for this, but first I'm going to step over and take a look at the Kubernetes node itself.

So here we are, now looking at what's happening in Kubernetes. The Kubernetes view is showing me the node detail; I can also look at the overall cluster detail or the workload detail. My workloads are up here, my containers are down here. I've got CPU utilization, networking, and all of the various pieces, and nothing looks really out of the ordinary. But I do notice over here that memory seems to be a little out of bounds, so let's find out what's going on with this pod.

Now I'm looking at the details of the container: its properties, its CPUs, and, whoa, its memory is being heavily used. I can see what's happening in any of these pieces. Most of the time you would have to step out, again with swivel-chair integration, to a terminal interface to find this level of information, but we include it so you can see what's actually happening all in one view. I notice, for instance, that my memory limit is set to unlimited, probably a configuration error. So now I know the quick way to respond and get things back to stability: I can simply roll back, or change this memory limit to something more reasonable; a sketch of that change follows below. But I also might want to spend some time looking at what's going on inside the application space, because I've obviously got something eating a lot of memory, a memory leak someplace.

I can now step into the log investigation itself. I can see what the average container response times are here. Since I did respond, rolling back my environment to remove the problem, the system is now configured with the memory allocation it should have. But I can also look at what was happening inside this instance of the container: here's the user that was doing this, and there was an exception thrown, a Java heap exception. We can see exactly where it came from, what host was driving this information, where the logs came from, and what the source types are. We can also see that this has happened repeatedly; there are a fair number of these occurrences. Obviously something is taking too much memory, and the service gets stuck as a result.
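As an aside, here is a minimal sketch of what fixing that unlimited memory limit might look like using the Kubernetes Python client. The deployment and container name ("payment"), the namespace, and the limit values are assumptions for illustration, not taken from the demo environment.

```python
# pip install kubernetes
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside the cluster
apps = client.AppsV1Api()

# Strategic-merge patch: give the container an explicit memory limit
# instead of the unlimited default that let the leak consume the node.
patch = {
    "spec": {
        "template": {
            "spec": {
                "containers": [{
                    "name": "payment",  # hypothetical container name
                    "resources": {
                        "limits": {"memory": "512Mi"},
                        "requests": {"memory": "256Mi"},
                    },
                }]
            }
        }
    }
}

apps.patch_namespaced_deployment(
    name="payment", namespace="default", body=patch)
```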
So now we've gone from the initial metrics viewpoint, to an alert based off of those metrics, into the application space, where we determined what is happening in the application and where, in our increasingly complex, convoluted application environments, the error is probably occurring. We then looked inside that and determined how the infrastructure is impacting our application's ability to do its job. And finally we stepped all the way into the logs, so we could determine what is happening in the underlying structure and how to resolve it, from infrastructure to application, making sure that everything is working hand in hand. Thanks, and enjoy the rest of your day.