I'd like to thank everyone for joining us today. Welcome to today's CNCF webinar, titled The Cybernetics of Observability and Monitoring. I'm Carlisa Campos, senior member of technical staff at VMware and also a CNCF Ambassador. I'll be moderating today's webinar. I'd like to welcome our presenter, William Louth. He is a complexity scientist at Instana. Before we get started, just a few housekeeping items. During the webinar you are not able to talk as an attendee, but we do have a Q&A box at the bottom of the screen. Feel free to drop your questions there. We encourage you to use the Q&A rather than the chat, because it's much easier for us to manage. We'll get to your questions at the end of the presentation, and we'll stop at around a quarter to eleven. So we are all clear: this is an official webinar of the CNCF, and as such it's subject to the CNCF Code of Conduct. Please do not add anything to the chat or questions that would be in violation of that code. Basically, please be respectful of all your fellow participants, presenters, and hosts. With that, I will hand it over to William to start the presentation. Thank you, William.

Thank you, and thanks for the nice introduction. Good afternoon — well, good evening where I'm based, in Europe, in Holland, and probably good morning in the US. As I said, I'm William Louth, a complexity scientist at Instana. Just a little background on where I'm coming from: for the last 20 years I've worked in the areas of observability, controllability, and operability. Observability isn't really a new thing — I probably would have called it monitoring before, but nowadays we refer to observability. My focus, at least at Instana, is looking for new ways of building observability into systems.
New models — the models you'd consider today would probably be tracing, logging, and metrics — and I'm looking at new models that will scale to the complexity we have in modern applications. I've also spent many years building controllability into applications: quality of service brought up from the network into applications, and resource management. In this talk I won't be touching on operability, but I also work on operability at Instana, in terms of visualizations — how we can help people learn the models at work within their particular architecture and environment, and how the intelligence of the product manifests between the human and the machine.

To give a good perspective on where I come from — it's slightly different from most of the other speakers in this field. We have 20 years of monitoring, and even more before that, but observability is relatively new in terms of the term itself. I see observability changing in two directions. We started from logging, and we've moved through metrics and tracing to where we are now. One direction is higher level and more operational: effectiveness, DevOps, the human — or humane — side. There I see us going into the area of signaling, which I'll touch on in the rest of the talk. The other direction: where logging went to tracing, the next step is metering. Tracing today is only about clock time, but metering is about any type of resource. That's probably the big difference between tracing and metering — with metering you can have multiple measures attached to any span or any type of event you're measuring.
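To make that distinction concrete, here is a minimal sketch of a metered span — my own illustration, not Instana's API — where a span carries several resource measures (clock time, CPU time, and any caller-supplied meter) rather than clock time alone:

```python
import time
from contextlib import contextmanager

class MeteredSpan:
    """A span carrying multiple resource measures, not just clock time."""
    def __init__(self, name):
        self.name = name
        self.measures = {}

    def record(self, meter, value):
        # Accumulate an arbitrary resource measure against this span.
        self.measures[meter] = self.measures.get(meter, 0.0) + value

@contextmanager
def metered(name):
    span = MeteredSpan(name)
    wall, cpu = time.perf_counter(), time.process_time()
    try:
        yield span
    finally:
        span.record("clock.time", time.perf_counter() - wall)
        span.record("cpu.time", time.process_time() - cpu)

with metered("checkout") as span:
    sum(i * i for i in range(100_000))   # some work inside the span
    span.record("db.calls", 3)           # any resource, not just time

print(sorted(span.measures))  # ['clock.time', 'cpu.time', 'db.calls']
```

The meter names here ("db.calls" and so on) are made up for illustration; the point is only that one event can close with several measures attached.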
Mirroring is where I imagine it going — something I built in my past work at Autoletics and JINSPIRED — where you basically mirror one machine's behavior over into another machine. And then I think the future will be around simulation. With all of this, what we're doing is very detailed. So one direction is for the DevOps group, and the other is really about giving people more diagnostic capability. It's about reconstruction — explorative reconstruction. It's focused on the developer and it's really at the machine level, and I imagine in the future it will mainly be consumed by machines, because I don't think humans can work at that level anymore, not with the kind of systems we're building.

Okay. So, to bring in cybernetics: I want to talk about why we need a cybernetic view of observability and monitoring, and the problem there — the challenge we're talking about — is complexity. So what's complex? Well, we have complicated and we have complex, but what I consider is that we're looking at complex systems. Complex systems are really dense networks — a lot of interconnections, and the interactions across those interconnections consist of adaptive agents. Now, the agent part puts some people off, because we've had agent technology before, but for agent you could just as well say a microservice or a library. And cloud is an example of an adaptive agent: you purchase a service, and the service is able to scale up and down depending on the workload. In doing that, it's adaptive. So there is already an adaptive nature in our systems today. Complex systems also work on multiple scales, which means feedback loops between things and across boundaries — and also across time scales and space scales.
And sometimes one feedback loop is feeding — signaling — toward a higher-level feedback system. So it's not easy to drop down layers and still understand each layer while looking at the layers above. And dynamic states: determining the current state of a system, of a particular subset of the system, or of an entity is hard, because the environment is changing and it's quite vast. It's a very dynamic system, and that means the states themselves are dynamic.

Okay, so where we started: when you think of topology — when you think about interconnections — you could think at the hardware level, where you talk about hosts, or even OS provisioning. We had hosts, then we've seen containers come along, and then clusters. And then we have services, which have load balancers and multiple instances. You can see that in all of this we're multiplying components; the number of components keeps increasing. We had the monoliths, where the libraries — the components — were embedded within them. Then we had microservices, increasing the modularity of the system. And the natural progression is probably to flows and functions, where the pattern is self-organizing systems — or you could say it's the developer doing the organizing; there's some organization in there. Now, the connectivity is an important aspect of the complexity itself. And we can see that in Instana. This is actually a screen from one of the customers Instana has. I haven't put the whole number of nodes in there, because it over-clustered. But what's behind that is not some art form I made up — it's actually taken from one of our customers.
And there, as you can see, the number of nodes — the APIs — is large, as is the number of dependencies between them and the size of the flows. A flow could be considered a workflow; you could consider it an endpoint within the services. And you can see that it's getting out of hand. It looks complex — or you could say it looks complicated.

So I said cloud is an example of complexity. We have self-service; broad network access, which is what we require in a complex system; pooling and elasticity, which is the dynamic nature of the environment; and metered services, which bring in a kind of control mechanism, where you can have rate limiting and service quality settings. This of course means that the environment is changing on a moment-by-moment basis.

Now, another thing we're seeing with complexity, of course, is change. Previously, deployment cycles for an application or service were something like six months to a year. People would plan, they would do a lot of testing, and then they would release. The bugs would mount up over time, then there would be another large release, and then someone would hopefully fix those bugs and introduce a smaller number of new ones. That cycle was quite long. The height here represents the number of issues that can come in and how long they stay there — because if you have an issue in production, the longer it stays, the greater the damage to the system. And so we've moved to shorter deployment cycles; we're now dropping from six months to weekly or even daily. That also means we're squashing those issues as we go — improving, optimizing the system. It's a continuous process of improvement.
It also means that any negative effect we've introduced — regressions and so on — is minimized; the impact is minimized in terms of time. Of course, we're not sure of the nature of the impact itself, but its duration is minimized by constantly changing our environment. And the future looks like it will just keep getting more fine-grained, where deployments become nearly automatic — not even controlled by an operator pressing a button and saying, okay, I'm going to release in the next 15 minutes.

The problem, of course, is that the rate of change and the growth of complexity can put stress on a company and on an infrastructure. And there are two likely outcomes. Either you have adaptive control — well, I wouldn't say just adaptive, because you're always adapting to what you're sensing within the environment and to the speed at which you can take on change and complexity. What you really want to do is regulate the amount of change and the amount of complexity you bring into your system. That's a kind of steering mechanism for the company. The company also has to change in line with the rate of change happening within its environment and within its own infrastructure. And if an application, a service, or a company is not able to keep up, then the natural thing is for it to collapse under the stress. The way to see this is, well, sometimes a hit-and-miss analogy: if you remember, at the end of Terminator 2 there was the T-1000, the liquid-metal one. You could see it constantly changing shape, and it couldn't stabilize.
And that's what happens in these kinds of environments: the rate of change and the complexity keep getting out of hand, and the company — or the infrastructure — is not able to stabilize. Stability is really a sense of memory. And this is a challenge in operations and observability: if the environment is constantly changing, how do you create a model? How do you reason about something when that reasoning may only have 15 minutes of validity? That's the problem.

So now I'm going to move on to cybernetics. That was the problem statement, and cybernetics is a solution — or at least a route to a solution — for how we can use observability and monitoring to help us with complexity and with change. The extended definition of cybernetics is the study of control and communication — in the original book, in the animal and the machine; humans, you could say. The two key things there are control and communication, and that's very like what we see in a DevOps environment. Now, what do control and communication mean here? Typically, when we think of communication in cybernetics, we're thinking of signaling, and control is some type of action or response to that. You can see a little of this in the application systems we have today: we've had the Reactive Manifesto — reactive systems and reactive platforms — which is a response to how to handle workload and variability in a system.

The classic example of cybernetics always comes down to the feedback system, the feedback loop. I'm taking this from manufacturing, and then I'll try to relate it to something else. In a manufacturing system you have an assembly line, with some kind of input coming in to a controller. The controller has the input, which comes from some kind of supply system, and then there are goals given to the controller to regulate that flow.
The flow passes along into a process where activities are performed, and when it comes out of the process we have an output that is measured by a sensor, and those measurements are fed back into the controller. This is how the controller can regulate the amount of flow going through the system. Think of it like a valve: you open up the valve, and you have a measurement at the other end where it outputs. As you open it, does the output increase linearly with the input, or is there some degradation once you reach a certain rate of flow within the system? That would tell you it's slowing down, and then you have to take a corrective action. That's typically what you're looking at in a feedback system. That's cybernetics.

Another way of looking at it comes from systems thinking, or system dynamics, where you have an inflow — it's more abstract here — and a stock, which is a kind of resource that you're managing. An inflow adds to the stock, an outflow takes from the stock, and you regulate the system by the amount of stock in there. From a programming background, you could see that as something like a semaphore — a pool with a number of tokens or tickets you can take. You regulate your system by how much you put into the stock and how that changes the inflow and outflow rates. A little abstract, but getting more concrete: if you were managing something like quality of service, where something has to begin a task, you would have a resource pool, which you could see as worker threads or something like that.
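The valve picture above can be sketched as a few lines of code — a toy proportional controller of my own, not anything from the slides — where the sensor measures the process output, the error is the signal fed back, and the correction is the control action:

```python
def control_step(valve, goal, measure, gain=0.5):
    """One pass of the feedback loop: sense, compare with the goal, correct."""
    output = measure(valve)          # sensor: measure what the process emits
    error = goal - output            # communication: the signal fed back
    return valve + gain * error      # control: corrective action on the input

# A process whose output degrades once the flow rate passes 8.0.
def process(valve):
    return valve if valve <= 8.0 else 8.0 + 0.2 * (valve - 8.0)

valve = 0.0
for _ in range(50):                  # iterate the loop until it settles
    valve = control_step(valve, goal=6.0, measure=process)

print(round(process(valve), 3))      # settles at the goal: 6.0
```

The `gain` constant is an assumption for the sketch; the point is only the shape of the loop: measure, compare, act, repeat.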
When you begin an action — if you look in the middle of this chart, you have a begin — that reserves something from the pool. You take it from the pool itself; that's the minus sign there. And if nothing is available, it generally does some kind of blocking. After you've reserved it, you've got your token — think of a restaurant: you reserve a table, you get your reservation, and off you go to do your work. And you put that into your reserve pool; that could be like reserving two seats. Later, maybe I ring up to make a change and I need an additional two seats; well, I need to remember what I already have, and then I ask for another two. So you have this reserve pool, which is what you've taken from the resource pool. And when I've had my dinner with colleagues and I'm finished, I release that, and it goes back into the pool, which allows other people to come to the table. Control there is how much capacity you put into the resource pool, and the duration of how long someone stays in a reserved state before the release. I can share links on the Instana website later with more information on this.

So cybernetics really comes down to feedback, flow, and control — controlling the feedback and the flow — and then communication, which is the sensory feedback, the signals going through the system. If you think of DevOps, DevOps is very similar. It's about feedback on the changes you're making and how good those changes are in production; about the workflow itself, how much flow — what rate of change — you allow into the system. And this is where you put in some control.
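That begin/reserve/release pattern can be sketched with a counting semaphore — my own illustration, assuming worker capacity as the managed stock:

```python
import threading

class ResourcePool:
    """Quality-of-service pool: begin reserves capacity, release returns it."""
    def __init__(self, capacity):
        self._tokens = threading.Semaphore(capacity)  # the stock being managed
        self.reserved = 0                             # the reserve pool
        self._lock = threading.Lock()

    def begin(self, seats=1):
        # Take from the resource pool; blocks if capacity is exhausted.
        for _ in range(seats):
            self._tokens.acquire()
        with self._lock:
            self.reserved += seats

    def release(self, seats=1):
        # Return capacity so others can come to the table.
        with self._lock:
            self.reserved -= seats
        for _ in range(seats):
            self._tokens.release()

pool = ResourcePool(capacity=10)
pool.begin(2)         # reserve a table for two
pool.begin(2)         # ring up later and add two more seats
print(pool.reserved)  # 4
pool.release(4)       # dinner over; capacity goes back to the pool
print(pool.reserved)  # 0
```

Control here is exactly the two knobs named in the talk: the capacity you give the pool, and how long callers sit between begin and release.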
Some people call them error budgets; other people look at risk levels or so on, to get more formal. And then of course you're communicating — communication between human and human, or animal and animal, and communication between man and machine. Even within the process, people will want to know that a system has made a change, and that change can be replicated to other systems — sometimes to Slack through an integration, or even into the product, where we have a pipeline feedback mechanism that allows us to tag a window in the timeline saying there was a change here. That's useful, because you're always communicating with other agents in your environment, and those agents can be human or machine — machine here being the Slack integration.

So, cybernetics: what we're seeing with Kubernetes is that we have operators there that are very dynamic, changing the topology, changing how the system is structured. And you also have humans acting as operators. The challenge in the future will be how you bring those two together — able to share responsibilities, coordinate and collaborate, and even hand over tasks from one to another. I believe cybernetics is probably the way we need to do that: we have to look at models of feedback loops, signaling, and resource management, and at how we can get both the human — the DevOps person — and the operator — the software — to understand the system in terms of resource modeling and the policies we have around that.

Now, of course, when you think about this, what we're really trying to do in all of DevOps is to have some intelligence. And what does that mean?
Intelligence is action appropriate to context. There are other forms of intelligence, in terms of reasoning, but when we think of DevOps, which is very action-oriented, we're looking to make an appropriate action for the context we're in. And the next problem is: what is that context? Context is the environment, the system — but it's something that is changing. And this is where observability comes into it.

So, context. Context is the circumstances that form the setting for an event. Sorry, there's a bit of lag here. What we're really talking about is the setting — we'd naturally think of it in terms of an environment and the events happening within it. Those events also construct the environment; they have an impact on the environment itself.

When Instana was formed in 2014 — at least the inception part, which is where I came in, and then I left shortly after that for various reasons — I was asked by the Instana group, the inception team, to come up with how I imagined the future of observability and monitoring would look. And I've always believed that the tooling we're trying to create is really trying to create a story, a narrative. So I looked to the film industry — looked to film — to help explain what we should be doing, what we should be trying to construct, what the context is that we're trying to reconstruct or simulate. In film you have these kinds of things: a setting, a scene, an act, a sequence; there are actors within the scene, and they have behavioral changes; and of course changes happen within the environment. These are all built up from lower-level events. That's what I mean by context.
Context is really the reconstruction of an event within that environment — at least enough that we can make sense of what the system looked like at that time, in terms of the nodes of the topology, what flows were happening, and what the health of the system was. That's what we're always trying to do in observability: reconstruction.

Now, the definition of observability has changed over the last few years. The original definition was the inference of the internal states of a system from the knowledge of its external outputs. The state we're talking about there would probably be more the health state — the quality of what it was doing, whether it was operating — because observability was really a measure: a measure of whether you can understand the state of the system. States tend to be conditions, and a very limited number of conditions, at least when we think about modeling — because if we said a system had a million states, it wouldn't be a very useful model. So generally we group them into categories of states: a good state — healthy, okay — a degraded state, a defective state. We take these higher-level categorizations of state, and that's what observability should be trying to do.

And how we do that, coming back to cybernetics, is that you're always looking for signals within your environment. An observer is observing other systems — and this doesn't mean the observer is outside the system; it could be one microservice looking at another service, or an agent looking at another agent. They're observing each other's behavior, and certain behaviors are signals, and from those signals we infer a state. The best way to see this is to think about animals, because animals also have signaling.
Animals will make aggressive moves — they'll throw their arms up and shout and growl — and generally that is a signal of aggression. If that happens frequently enough, or at least consecutively, we'll probably infer a state: that this animal is wild, or crazy, or angry. That's the state we infer from it. An observer is looking for that, and that's the way we should be considering observability of systems.

So observability today, in terms of the legacy model: we think about traces — distributed tracing or even local tracing; metrics, which are your little dots on a chart over a timeline; and logs, which are an echo chamber for a developer writing strings. In the past we put them into tables; they went into databases — a single data store, with tables mapped to each of the types — and then we had a console where we queried the dataset and charted it. That's what we really had up to a few years ago.

The modern way is to think about sensors. We're really moving into a multi-sensor environment. We have sensors making collections, and these are generally passed on to agents, because we can't always get to the backend systems we have, and sometimes we're writing to multiple backend systems. So agents are placed in there as channels for pushing those collections off into another system. And this other system generally acts like a fusion: it takes all the sensory data and fuses it to make sense of it. It takes metrics, traces, logs, the discovered topology, the configuration, and then it reconstructs a model — the fusion — like a topology of the system. The topology could be structural: the nodes, the hosts everything is living in. And the topology could also be the call graph.
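As a toy illustration of inferring a state from repeated signals — my own sketch, nothing product-specific — here is an observer that only promotes a signal to a state once it occurs consecutively enough times, just as a single growl doesn't make an animal "aggressive" but three in a row might:

```python
from collections import deque

class Observer:
    """Infer a state once the same signal is seen N times in a row."""
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.recent = deque(maxlen=threshold)  # sliding window of signals
        self.state = "unknown"

    def observe(self, signal):
        self.recent.append(signal)
        # Only consecutive repetition of one signal changes the inferred state.
        if len(self.recent) == self.threshold and len(set(self.recent)) == 1:
            mapping = {"growl": "aggressive", "slow_response": "degraded"}
            self.state = mapping.get(signal, self.state)
        return self.state

obs = Observer(threshold=3)
for s in ["growl", "purr", "growl", "growl", "growl"]:
    obs.observe(s)
print(obs.state)  # aggressive
```

The signal-to-state mapping and the threshold are invented for the example; the shape — behaviors as signals, states inferred from their pattern — is the point.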
And once it has a model, the focus is on change. But change itself is not really what we want — we know change is always happening, so it's not very useful to be constantly monitoring it. You can monitor the rate of change, but it won't tell you much unless you have a means of classifying those changes. Then you look for signals in those, and from them you derive a status — because at the end of the day, what we're always trying to do is determine the status or state of one or more of our systems, which we can break into subsystems.

Okay, so that's what we've had. I think we're also seeing the observability space moving on. Recollection was the legacy view — go back to the legacy, where we had these tables and a search capability. With recollection, we search: what happened at this point in time? Generally you look for a tag or a time window, you search for it, and then you identify something. But the problem is: what should you be searching for? This might not have been such a problem before, when you were working with a model that didn't change more than every six months or so — you could build up some knowledge. But when everything is changing rapidly and there's very little memory in there, what you really want is for the system to focus on recognition: recognition of similarities, or recognition that something is even different. And then signification — once you can recognize something, you can give significance to it, and that can be significant or insignificant. And then what we want is suggestiveness: we want the tooling, the monitoring solutions, to be suggestive of what we should look at, rather than us searching or exploring.
I simply don't think operations today have the bandwidth to go around exploring. It would probably have been fine previously, when we didn't have systems like today's, but with the change, with the growth in complexity, with the number of entities, I don't think it's scalable anymore. So for maturity we have to move to more recognition-based and suggestive solutions — inspections or advice tools — for a human to scale, as opposed to a machine.

So why do we observe? In the most general terms: to monitor signals. I'll come back to what signals are later. Everybody is taking measurements and hoping — we know the measurements carry a lot of noise — and we're looking for a signal in them. No one really knows exactly what a signal is, but generally that's what we're doing: we're looking to monitor signals in the environment. That's why we observe.

Why do we monitor — why would we keep watching for signals? Because we want to control states. Our infrastructure is something we're responsible for, and the way we can manage it is by looking at the states of the system. When we look to control states, we're really trying to manage service quality — the service we're delivering. So we have to look towards control to help us manage. Observability gives us that context; it reconstructs something for us. But at the end of the day, we still need to reason about it and act on it to manage our system. So observability is basically a subordinate, or servant, to controllability. And controllability is really where we're all going.
Controllability can manifest as a machine taking over control and self-regulating, or simply as humans responding to what they see within their observability or monitoring space. Controllability is being able to direct or influence the behavior or the course of events in our systems. How it fits with observability: controllability needs observability for perception — to reconstruct, to create the context. Then controllability needs some kind of monitoring capability that tells it what to attend to, and, when there is something of significance, what actions to respond with. That's where controllability comes into it.

Now, I'll touch on this a little, because cybernetics is really about control — but control of other systems. You can also have second-order cybernetics, which is control of the controller. So when you think about controllability, you can also apply it to monitoring, because observation is itself an action, and if it's an action, it can be regulated — and then you basically have a control feedback loop.

Some people like to imagine monitoring as the old traditional way, where you ping your server and ask: are you up or down? There was a usefulness to that, because it focused on the state of the system, but it didn't give us an accurate assessment, which was unfortunate. I think monitoring is still here today. When we think about monitoring, we always associate it with a technology or a tool, but it's really a process. Monitoring is the process that manages how we observe our environment.
That's what I want to talk about, because there's always a bit of this observability-versus-monitoring debate — observability being greater than monitoring, bigger than monitoring. I think observability probably looks at more data than monitoring, but monitoring is really about steering observability, because there's only so much we can collect. We've had this with profiling agents: profiling agents have an adaptive mechanism, where they look at their own overhead and adapt. That, to me, is a monitoring capability — it's determining: is this event useful or not? And that's really what our operations should be focusing on: I want a tool that helps me manage the degree of observability within my environment, because I simply can't store everything. Storing everything increases overhead in storage costs, and it increases the cognitive workload for both the machine and the human.

You want to direct attention, and you want to integrate multiple senses. Observability today tends to be sliced, siloed: here's tracing, here's logging, here's something else, and here are alerts and so on. Monitoring is really that fusion. It integrates the senses and reconstructs memories — because observability is really a collection system. It might allow you to explore, it might allow you to search, but it's not about reconstructing the memory. We know memories are multi-faceted, which means we have to integrate multiple senses, give significance, and relate it to what is of interest to us — which is how we make representations. So monitoring is about identifying the patterns that observability is not focused on — observability is about collecting them — assigning significance to them, aiding the reasoning of the human and the machine, and guiding our action.
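The profiling-agent behavior described above — an agent regulating its own collection based on its measured overhead — can be sketched like this. This is my own illustration of the idea, not any particular agent's algorithm; the rates and budget are invented:

```python
class AdaptiveSampler:
    """Monitoring steering observability: cut the sampling rate when the
    agent's own measured overhead exceeds its budget, recover when there
    is headroom."""
    def __init__(self, rate=1.0, budget=0.02):
        self.rate = rate        # fraction of events collected (1.0 = all)
        self.budget = budget    # acceptable overhead, e.g. 2% of CPU

    def adjust(self, observed_overhead):
        if observed_overhead > self.budget:
            self.rate = max(0.01, self.rate * 0.5)   # back off sharply
        else:
            self.rate = min(1.0, self.rate * 1.1)    # cautiously recover
        return self.rate

sampler = AdaptiveSampler()
# Overhead spikes, then subsides: the sampler backs off, then creeps back.
for overhead in [0.05, 0.04, 0.01, 0.01]:
    sampler.adjust(overhead)
print(round(sampler.rate, 3))
```

The asymmetric step (halve on breach, +10% on headroom) is a common shape for this kind of regulator, chosen here only to show the feedback loop acting on the observer itself — second-order control, in the talk's terms.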
And of course we want feedback on the actions we take, to help regulate our observability and also the control itself. Action can be directed at observability, or at the context of the environment we're working in. The way I paint monitoring and management versus observability and controllability, where observability is seeing and controllability is acting on it, is that there's a strategic track or lane and a tactical one. The tactical is the day-to-day operational observability and controllability: what control do operations have in their environment? Observability feeds data into monitoring; monitoring feeds signaling, regulation, and policy back into observability. Controllability also feeds into monitoring, and monitoring in turn helps regulate controllability. Likewise, controllability feeds into management. Management's view is: how much change do we want in our environment? How can we make that change, which comes down to controllability? How can we monitor that change, or its success? And how do we see that the change is happening? So you're zigzagging between the strategic and the tactical. There's a maturity track in this: maturity moves from observability to monitoring, at least in my definition of monitoring, the modern version and not the product-specific pinging. Then the maturity runs from top to bottom, from observability to monitoring to increasing controllability in the system, both machine and human, scaling that, and then of course to management, which is always about the policies of change. So observability is looking, monitoring is seeing, controllability is acting, and management is regulating.
That's the way I like to see it. Another way of seeing observability and monitoring: observability gives us sensory data, and monitoring is the semantics we attach to that sensory data and the significance we derive from it. You can see that as the sensory data moves up toward the top left-hand corner it gets smaller, and that's where we increase significance: we get greater significance by crunching down that data and turning it into information. Some people call it data, but there's some math in there. Okay, effective monitoring then depends on connecting and on contextualizing. By connecting I really mean rebuilding the associations: we have made a change, that change is rippling into the environment, and we connect it back to the control we want to have over it. The context is rebuilding what our environment looks like. An observability tool is always rebuilding context. It's building a memory of a moment in time that says: this is what the system looked like, this is what the topology looked like in terms of hardware and containerization. Containerization I mean in a very general sense: one thing is contained within another, which gives isolation, and then there are the flows that cross those boundaries. So what we want to do is keep speeding up that cycle: making the change, connecting that change back to its control and its effects within the environment, and then rebuilding a new context that represents the new state. So we're in the home straight now. This is where we get on to cognition. I'm just looking at my time; I think I have nine minutes to get through this, which is probably okay, and then I'll take questions for the last ten minutes. I've changed the color here to purple because I see this as where we're heading.
This is the future, and this is really what we're all looking for. When you buy a product that says AIOps, or something that claims to be intelligent, you're really looking for cognition within what we're collecting and what we're monitoring. So what is cognition? It's the process of acquiring knowledge and understanding through thought, experience, and the senses. Experience could be the reconstruction of episodic memory, or the memory itself as it's lived. And of course, to create an experience we need to sense the environment we're in, and then we think it through: we create representations from the experience and from the senses. It's not sequential, of course, because thought itself impacts experience and sensing; there's regulation there. The more attentive you are to something, the more you see, and the more thought you can give to it; in turn, the experience heightens. It's a circular process between those systems. Now, my interest is that where we're going with microservices and other systems of modularity and flow is a social network. We're seeing nodes come up, and we're seeing microservices become more than just "I make a call to something else and it returns a response." What we're seeing today with microservices and other messaging systems is a lot of metadata being transmitted. The interaction between services is not just "send the request, give me the data back." It's more like: I'm going to drop this because I'm overloaded at the moment; there's some rate limiting, so I'm going to have to slow you down for a bit; there's handshaking going on. So we've gone from RPC, remote procedure calls, to what I consider conversations, conversations between services.
And you see this with the Reactive Manifesto, where services try to signal when they are available, or when they want something to be created and consumed by them. There's a mechanism going on beyond the call itself. There are effectively two channels: the data channel, which is "here's the request, here's the response," and the control channel within that, which says whether I'll accept it now, or whether I've actually dropped it into a queue and will give you a callback later. If you look at that, it's very human-like. So I consider where we're heading to be a kind of social cognition, which is the study of how people process, store, and apply information about other people. I think that's the way to see the future of observability and monitoring, especially with microservices, or any type of service, down to a Lambda, a task, or an event handler: we're looking at how services process, store, and apply information about other services in a system context. It's not so much about the workflow we're looking at anymore; we're also looking at the other services. To make it more concrete, since maybe that was a bit pie in the sky: think about a circuit breaker. A circuit breaker is a kind of cognition. If you try to access the database and it fails a few times, you say: okay, I'm not going near that database for a while. I'm going to refuse the next few requests that come in, report that the service is not available, wait rather than overload the database, and tell callers to come back later, refusing the connection there and then, before ever talking to the database.
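The circuit-breaker behavior just described can be sketched minimally. This is an illustrative sketch under simple assumptions, not any particular library's implementation; all names and thresholds are invented.

```python
import time

class CircuitBreaker:
    """Minimal circuit-breaker sketch (names are illustrative).
    After max_failures consecutive failures the breaker opens and
    refuses calls outright until reset_after seconds have passed,
    protecting the downstream service instead of hammering it."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def call(self, fn, *args):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                # Refuse before ever talking to the downstream service.
                raise RuntimeError("circuit open: call back later")
            self.opened_at = None       # half-open: allow one probe call
            self.failures = 0
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0               # success resets the count
        return result
```

The "cognition" lives entirely in the caller's memory of the other service's recent behavior: the breaker stores information about the downstream service and changes its own behavior accordingly.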
What you see then is a sense like what we used to have in the workplace, back when we had paper filing cabinets and in-trays on people's desks. If a manager walked over to someone's desk and saw that the work in their tray was piled quite high, they wouldn't drop new work in there; they'd probably look for someone else to do it. And that's really what we're seeing: services taking on these new semantics. It's actually quite challenging for observability tools, because we can't distinguish the cases. Say a service talks to another service, and the first few calls time out or fail, but there's a retry policy and eventually it succeeds. The calling service reports back to its client that it succeeded, and it did succeed in what it needed to do, because it eventually got there. Even though individual calls to the other service failed, the conversation with it succeeded in the end. So how do we observe that? How does a tool focused on tracing know that we're allowing failure here, and that it can ignore these failed traces because the final attempt succeeded? We can't seem to capture that today with distributed tracing, so we have to do something different. What I see us doing with service cognition, how services process and store information about other services, is looking at how services interact with each other in a more dynamic, adaptive way, sensing the other service. If you sense that the other service's response time is slowing down, you might change your own behavior in terms of how quickly you call it. And the way they would do that, of course, is signaling, and there is some signaling already there today.
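The retry-masking problem he describes can be illustrated with a toy sketch: per-attempt "spans" record failures even though the overall conversation succeeds, so a tool counting error spans sees two errors while the client saw none. The span format and helper names are hypothetical, not from any real tracing library.

```python
# Collected "spans", one per call attempt (hypothetical format).
spans = []

def traced_call(fn, attempt):
    # Record an error span on failure, a clean span on success.
    try:
        result = fn()
        spans.append({"attempt": attempt, "error": False})
        return result
    except Exception:
        spans.append({"attempt": attempt, "error": True})
        raise

def with_retries(fn, max_attempts=3):
    # Retry policy: the caller ultimately reports success to its client.
    for attempt in range(1, max_attempts + 1):
        try:
            return traced_call(fn, attempt)
        except Exception:
            if attempt == max_attempts:
                raise

outcomes = iter([Exception, Exception, "ok"])
def flaky():
    outcome = next(outcomes)
    if outcome is Exception:
        raise ConnectionError("timeout")
    return outcome

print(with_retries(flaky))             # ok
print(sum(s["error"] for s in spans))  # 2 failed spans recorded
```

The trace backend is left holding two error spans for a conversation that, from the client's point of view, never failed; distinguishing "tolerated failure" from "real failure" needs information the spans alone don't carry.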
HTTP status codes have that already: they can say I've dropped you, or I'm overloaded. So we do have signals embedded in systems; we just need to formalize them and create a universal signaling glossary that works across whatever protocol we have. Signals evolved to convey meaning and influence receivers. The signal in itself tells you something, and when someone receives a signal, they are meant to change their behavior. That's why people send an error code or an HTTP status back: redirect this, I'm dropping this, I'm delaying you. It's there to change the behavior of the caller. Senders obtain effects; receivers obtain information. (Thank you, two minutes to go.) The cycle is very similar to feedback systems: a signal is transmitted over some kind of medium, a receiver records it, reasons about it, and responds. It already feels like cybernetics, and feedback feels like what we do with observability. Now, a signal is very different from a message or from a log, in that the signal itself is the meaning; there is no difference between the two. You don't have to unpack the message and look inside it. This is where logging fails: a log entry is not a signal, it's a message, and inside it we hope to find the meaning. What we should be looking at in the future is sending the signal, the meaning, directly. And how do we do that in the environment? Well, we can look at something like stigmergy, which is a mechanism of coordination through an environment, between agents and actions. Typically you have an agent, or a microservice, that performs actions, and these leave signals or signs within the environment, the medium.
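The idea of a signaling glossary over existing protocol codes might be sketched like this, assuming an invented mapping from HTTP status codes to caller behaviors. None of this is a real or proposed standard; it only illustrates "receiver of a signal changes its behavior."

```python
# Hypothetical glossary: each protocol code maps to one shared signal
# whose meaning is the signal itself, no message unpacking needed.
SIGNAL_GLOSSARY = {
    429: "slow_down",   # Too Many Requests: reduce the call rate
    503: "back_off",    # Service Unavailable: stop and wait
    302: "redirect",    # go elsewhere
}

def interpret(status_code, current_delay_s):
    """Return the signal and the new inter-call delay the caller
    should adopt in response to it (illustrative policy only)."""
    signal = SIGNAL_GLOSSARY.get(status_code, "proceed")
    if signal == "slow_down":
        return signal, current_delay_s * 2       # doubling delay halves rate
    if signal == "back_off":
        return signal, max(current_delay_s, 30)  # wait at least 30s
    return signal, current_delay_s

print(interpret(429, 1.0))  # ('slow_down', 2.0)
print(interpret(200, 1.0))  # ('proceed', 1.0)
```

The sender obtains an effect (the caller really does slow down) and the receiver obtained information, which is the sender/receiver split described above.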
The classic example is ants: when they find food, they put down a pheromone on the surface, on a patch of ground, and another ant picks it up and follows along. This is where the history of previous actions stimulates or dampens the effects of future actions. The effect, which can be negative or positive, stimulates an action, and that produces more effects within the environment. That's probably what we need with signals. What I'm working on at Instana is signifying, where I try to create a universal signaling language and a mechanism for inference of state around services, resources, and contexts. Contexts are the environments we exist in; resources are what gets scheduled, as in Kubernetes; and services are the microservices we have in the system. And this is the final slide. Just to recap: what I think we're moving to is not going to scale with metrics, traces, and logs. We have millions of metrics, and even more traces and logs, and people just don't know how to give significance to them and classify them. And it's never going to happen on the back end. If a metric doesn't already carry a signal, significance in its very name, it's more than likely not going to be useful to management. So what we need to do is bring in a newer form of observability technology around signals that can translate into states we all agree on, like on a status page: we have these various states. And that's how we drill down into metrics and traces. We look at states; when a state changes to something interesting, we look at the signals that generated it; and from the signals we determine the time windows that are useful for looking at metrics, traces, and logs. So I'm finished, with one minute to go. That was great. Thank you, William. I'm surprised I got through it. Very good. And now we have some time for Q&A.
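The pheromone mechanism can be sketched as a toy loop in which agents coordinate only through marks left in a shared environment; reinforcement and evaporation play the stimulating and dampening roles described above. This is purely illustrative, not any real routing or scheduling algorithm, and all names are invented.

```python
# Shared environment: pheromone levels on two candidate routes.
pheromone = {"route_a": 1.0, "route_b": 1.0}
EVAPORATION = 0.9   # per-step decay of every mark (history dampens)
DEPOSIT = 1.0       # mark left behind after a successful trip

def choose_route():
    # Agents simply follow the strongest trail in the environment;
    # they never communicate with each other directly.
    return max(pheromone, key=pheromone.get)

for step in range(50):
    for route in pheromone:
        pheromone[route] *= EVAPORATION      # the environment forgets
    route = choose_route()
    if route == "route_a":                   # suppose only route_a succeeds
        pheromone[route] += DEPOSIT          # success reinforces the trail

print(pheromone["route_a"] > pheromone["route_b"])  # True
```

After a few iterations the successful route's trail dominates while the unused one evaporates away: past actions have stimulated some future actions and suppressed others, with the medium itself doing all the signaling.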
Again, please drop your questions in the Q&A box; it's at the very bottom of the screen. I have a question here for you from Christian Hayden-Rick, apologies if that was mispronounced. The question is: in the beginning you talked about metering being an evolution of tracing. Can you tell us more about what that means? Yeah, okay. At my previous company, Autoletics, or JINSPIRED, I focused on creating a Matrix for the machine, a Matrix world. What I did there, and how I distinguished it from tracing, is that Simz (the name is like the game The Sims, but with a Z in it) tried to reconstruct behavior, to reconstruct how code executes. If you take what code is at the fundamental level, in terms of how it runs in most systems: there are threads running, or sometimes nowadays goroutines, and they execute calls, which means pushing and popping stack frames. That, of course, is how we create scopes of execution. So with Simz I said: fundamentally, this is what all machines do, they're just pushing and popping frames. Ignoring the code that happens within a frame, the calculations, everything else is methods calling other methods. So I instrumented the JVM, and every method call created an event, and that event was streamed into another environment that replayed the whole event stream. You could think of event sourcing, but what's different is that it wasn't executing the code; it was mirroring the code, mimicking it. It would recreate the threads, actually push and pop the stack frames as if they were there, and even delay them to the durations they had in the original environment.
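The entry/exit event stream and its replay might be sketched like this. The event format and function names here are hypothetical, invented for illustration, not Simz's actual protocol.

```python
# A stream of method entry/exit events, one tuple per event:
# (thread id, "enter" or "exit", method name).
events = [
    ("t1", "enter", "handleRequest"),
    ("t1", "enter", "queryDb"),
    ("t1", "exit",  "queryDb"),
    ("t1", "exit",  "handleRequest"),
    ("t2", "enter", "healthCheck"),
    ("t2", "exit",  "healthCheck"),
]

def mirror(stream):
    """Replay the stream, mimicking threads and stack frames without
    executing any real code; return the max stack depth per thread."""
    stacks = {}                      # one simulated stack per thread
    max_depth = {}
    for thread, kind, method in stream:
        stack = stacks.setdefault(thread, [])
        if kind == "enter":
            stack.append(method)     # push a mimicked frame
            max_depth[thread] = max(max_depth.get(thread, 0), len(stack))
        else:
            # Pop must match the frame that was entered.
            assert stack.pop() == method
    return max_depth

print(mirror(events))  # {'t1': 2, 't2': 1}
```

The mirror reconstructs the shape of execution (threads, nesting, and, in the real system, timing) while being incapable of side effects, which is the "dreaming without sleepwalking" property he goes on to describe.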
Of course you could also speed it up if you wanted, but it mimicked behavior. Probably the best way to see it is a mime artist on a street, cleaning a window: we know he's not cleaning a window, but we know from his movement that he's mimicking the cleaning of a window. Mirroring is basically mimicking in that sense. How mirroring differs from tracing is that tracing today isn't really designed to allow the reconstruction, the replay, of a machine as it happened in the real world. Simz said: let's create those events, stream them in, with enough information that we can actually reconstruct the flow, so that to another product or another monitoring tool it would look like the actual application doing everything, even though it wouldn't, say, be changing a bank account twice. That's what mirroring is. I hope that explains it. Actually, I think the best analogy, and I should have said this, is dreaming. When we're awake, looking at something and doing something in the real world, we use a certain system. When we dream, our dreams are very vivid, we experience them, but we don't act; our control system is turned off. And that's what Simz was trying to do. It says: I will replay your memories, but you will not be a danger to yourself. You won't be sleepwalking. I'm going to ask you the next question. We have only five more minutes, but before you answer, William, would you tell us very quickly how people can reach you? We have more questions in the queue than we'll be able to answer. Yeah, they can always reach me on LinkedIn, or my email address is william.louth@instana.com. Or we can just have a chat.
Oh, I also have a Twitter account: Autoletics, A-U-T-O-L-E-T-I-C-S. Autoletics is actually a name for flow. It comes from a book; the book is Flow, and autotelic describes the type of person who likes to experience flow. And that's what I try to do. Tons of learning on this webinar. We have four minutes left. The next question in the Q&A box is from Luis Sanchez: could you comment on where Instana is in terms of the levels of your last slide (the pyramid from traces and logs up to states)? Yeah, well, Instana is definitely covering all the boxes at the base of observability. The reason I rejoined Instana is to bring in the signifying technology; there will be an announcement coming out shortly on that. Instana has pretty much covered the whole observability space, but with the growing complexity we have to come up with new management models, and that's the feedback system. So Instana is branching out into other areas, looking at how we close the feedback loop and better support DevOps, not just diagnostics. It has great diagnostic capability, and it's able to discover context automatically and dynamically: reconstruct your topology, reconstruct those call graphs. And that's wonderful, a great foundation for me. It's not just taking traces; it also has the environment context. What it would miss today, even though it has it to some degree with Incidents in the product, is this state management, the significance. And it's probably still early days for people to see why we need new observability, something like signals, because they're still moving to microservices, still bringing things over. They haven't gotten to the scale-out yet, and they're still working with observability tools.
And there will be a moment where they feel these things are not working for them. I recently gave a talk where someone said that a manager in their company, and this was more at the business-process level, had said: I don't want anybody producing a metric that I cannot associate an action with. I don't think we've come to that realization yet in the space of observability and monitoring. We're in this big-data, Hadoop-era mindset where everybody just says give me more data, grabbing everything, thinking that if we keep collecting more it's going to be useful. But as these systems evolve, with the greater adoption of microservice platforms, we're going to see these kinds of problems come up, and there's going to be a need for a language that can help man and machine. I hope Signify will do that; that's what I'm working on. All right. Thank you so much for the answers, William. That's all the time we have today. Thank you so much, everyone, for participating, and we look forward to having you at our next CNCF webinar. The recording of this webinar will be out later today. Thanks again. Thanks, everybody. Cheers. Bye.