Hi there, CNCF community. Today, we're going to be talking about a subject that is near and dear to my heart: OpenTelemetry, distributed tracing, observability, and a bit of architecture overall. My name is Tamimi. I'm a developer advocate at Solace, and I'm involved with all things developer enablement, from our Solace community forum to the demos and videos you're seeing right now. And I'm Rob Pumpkins. I'm a senior principal core product manager here at Solace. I've been delivering solutions to customers, from financial transactions to transportation to retail, over the last three years, and having a blast doing event-driven architecture. Fantastic. Let's go ahead and kick this off with OpenTelemetry and, in general, what OpenTelemetry is all about. So Rob, can you tell me a little bit about OpenTelemetry, what we'll be calling OTel throughout the video, and how it fits within CNCF? So OpenTelemetry, or OTel, is the second largest project within the CNCF, and there's a lot of great interest in it, because what OpenTelemetry really lets you do is normalize your tracing and telemetry information into a standard package that can be distributed to pretty much any observability tool out there. Observability has been around for many, many years at this point, and it just continues to grow and is becoming an integral part of most enterprises' infrastructures. And as we know, observability has three main pillars: traces, logs, and metrics. We will be talking mostly about traces, and in particular distributed tracing. As Rob mentioned, OTel has a very big open source community behind it that supports it, and we will be referring to this open source community throughout the video.
So let's look a little more into event-driven architecture: what it's all about, how we can blend the two worlds of OpenTelemetry and event-driven architecture, and what the advantages are. We'll start with event-driven architecture in general. Right. So event-driven architecture is a very hot topic. It was huge at AWS re:Invent, and it's core to cloud-native capabilities. Really, it's about decoupling your applications and your APIs and letting any type of application or service talk to any other type of application over a common messaging infrastructure, in an asynchronous manner, based on events. That's event-driven architecture. And really, it's built around a mesh of brokers: we deploy brokers in the event-driven architecture to move messages along to their target destinations. There are many different use cases for this kind of technology. For example, you could use event-driven architecture for analytics. This is very commonly done today: communicating different types of information between services to analyze your business data and make recommendations going forward, or running security checks and looking for vulnerabilities, all done in an analytical event-driven architecture. There's also an operational use case, where you're trying to combine many different components within your enterprise so that they all work in unison; no matter what protocol, what iPaaS, what cloud-native service, or what API you're choosing to use, it can all talk over top of an event-driven architecture. In these cases, sometimes you're using the event-driven architecture for very low latency and doing direct messaging.
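The decoupling idea described above can be sketched in a few lines: publishers and subscribers never reference each other, only topics on a broker. This is a minimal in-memory illustration under my own naming (`MiniBroker` is hypothetical); a real event mesh such as a Solace broker adds persistence, wildcard subscriptions, cross-site routing, and protocol translation on top of this pattern.

```python
# Minimal in-memory sketch of topic-based publish/subscribe decoupling.
# Illustrative only -- not a real broker API.
from collections import defaultdict
from typing import Any, Callable

class MiniBroker:
    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: Any) -> None:
        # Publishers never reference consumers directly; the broker fans out.
        for handler in self._subscribers[topic]:
            handler(payload)

received: list[Any] = []
broker = MiniBroker()
broker.subscribe("orders/created", received.append)  # e.g. an analytics service
broker.subscribe("orders/created", received.append)  # e.g. a fraud-check service
broker.publish("orders/created", {"order_id": 42})
# Both subscribers received a copy without the publisher knowing either exists.
```

Swapping a consumer in or out is just another `subscribe` call; the publisher's code never changes, which is the essence of the decoupling being described.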
Other times, you're really looking for persistence, having messages stored by the event-driven architecture so that they survive a consumer going down and are waiting for that consumer when it returns. Thanks, Rob, for this overview of event-driven architecture. As you mentioned earlier, one of the key advantages of event-driven architecture is that you can connect different microservices and applications within your system, whether you have legacy infrastructure or IoT devices. You have microservices communicating over different protocols and APIs, and there are different languages involved as well. So truly, when you think of an event-driven architecture at an enterprise level, or in real-world use cases, there are a lot of moving parts. That means the underlying messaging infrastructure powering your event-driven architecture should take this variability into account: it should support the variability in messaging protocols, APIs, languages, and all the different ways you can connect things in your event-driven architecture. So when someone needs to implement and deploy an event-driven architecture, those are the factors to take into account: how do you work with all this variability? You mentioned cloud-native solutions as well. This is another thing you have to take into account: if you want to adopt an event-driven architecture strategy in your system, you have to ask yourself questions like, how do I make sure my cloud and on-premises data communicate with each other? You have different form factors.
If you have, for example, data residency requirements and the data needs to stay in a particular country or region, how do you make sure there's cross-region communication? This is where the concept of an event mesh comes in, which Rob mentioned earlier: a network of message brokers interconnected together. All right, so this is great. We've talked a lot about event-driven architecture and OpenTelemetry. We mentioned previously the importance of observability, and how if you have an observability strategy in your system, you're already ahead of the game by collecting as much information as you can. But now, if you have an event-driven architecture and you want to deploy a distributed tracing strategy into it, there are many things you have to take into account, and it's a pretty challenging thing to approach. So before we delve into solutions, let's first talk about why it's hard to observe an event-driven architecture. What are the challenges behind implementing and introducing observability in an event-driven architecture? Right, so the challenges around observability in an event-driven architecture really come from its basic nature. You're talking disparate protocols, different microservices, different cloud services, each providing their own formats and their own ways of communicating, and you're doing it all in an asynchronous manner. A message goes into an event mesh and you don't know how many times it's being delivered to different consumers. You don't know exactly which broker is going to deliver it; particularly when you're going from, say, Asia to North America, as we were talking about, you're going to cross a couple of brokers.
If that message didn't get delivered, you don't know whether it was dropped by the consuming application, meaning it actually was delivered, or whether something happened in the middle of the infrastructure, say a time-to-live expired or the retry limit was reached and it was declared undeliverable. It can be very difficult to diagnose exactly what happened in the black box that is the event mesh today. And I think this is what's so exciting about taking OpenTelemetry and combining it with messaging: you can really understand not just the application layer, but what's happening in the messaging underneath, what happened to that message that went from this application to all these other applications, and even what the latency and timing were along the way. Yeah, and Rob, when I think of distributed tracing and observability, I think of collecting it at three different levels. First, there's the application layer, where your microservices can generate spans and tracing information from within the application around its business logic. Then there's the API level: when different microservices interact with other services or servers, spans are generated at the API level. And then there's the broker layer. So those are the three layers: application, API, and broker. It is extremely critical to understand what is happening within the messaging piece of your system, within the broker: things like whether the message was enqueued or not, whether the message failed within the broker, or the things you mentioned earlier on the consuming or producing end. So it is extremely critical and valuable to generate these kinds of spans from within the broker.
All right, so we talked about the importance of having observability in your EDA system and why it matters. And I do want to give a shout-out to the OpenTelemetry community in general for all the work they do from an open source perspective. One of the open source initiatives the OpenTelemetry community maintains is the OpenTelemetry Collector, the OTel Collector, which is extremely powerful since it's backed by open source and by the community. So Rob, a question for you. If I have a message broker and I would like to enable it to generate distributed tracing information in my system, what approach do I need to take, considering the OpenTelemetry Collector, to support observability in my EDA infrastructure? Right, so that's a great question. What you need to be able to do is tie that messaging infrastructure in with the application infrastructure, and you do that, as you were saying, at the API layer. Here we use context propagation, a capability provided by OpenTelemetry that allows us to propagate information from one layer to the next. Basically, we take that information, the trace ID, possibly baggage and other context, and put it into the event itself, so it's carried along with the message as it goes through the data path, and it gets updated by each span generated along the way. Each broker generates spans as the message passes through it, usually multiple spans. For example, you might have a receive span that records when the message was received and how long it took before it was enqueued, and it could be enqueued in many different queues within that broker. Then, from that point, you would generate a send span for every single one of those enqueued messages.
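The context propagation being described can be sketched without any SDK: the producer injects a W3C Trace Context `traceparent` header into the message's user properties, and the consumer extracts it to continue the same trace. This is a hedged, hand-rolled illustration of the header format (`version-traceid-spanid-flags`); in practice you would use the OpenTelemetry SDK's propagators rather than building the header yourself, and the message shape here is invented for the example.

```python
# Sketch of W3C Trace Context propagation over a message, without the
# OpenTelemetry SDK. The trace context rides along inside the event itself.
import secrets

def new_traceparent() -> str:
    trace_id = secrets.token_hex(16)  # 128-bit trace ID (32 hex chars)
    span_id = secrets.token_hex(8)    # 64-bit parent span ID (16 hex chars)
    return f"00-{trace_id}-{span_id}-01"  # version 00, sampled flag 01

def inject(message: dict, traceparent: str) -> None:
    # Producer side: carry the context with the event through the mesh.
    message.setdefault("user_properties", {})["traceparent"] = traceparent

def extract(message: dict) -> tuple[str, str]:
    # Consumer side: recover the trace ID and parent span ID to continue the trace.
    _version, trace_id, span_id, _flags = (
        message["user_properties"]["traceparent"].split("-")
    )
    return trace_id, span_id

msg = {"payload": b"order created", "user_properties": {}}
tp = new_traceparent()
inject(msg, tp)
trace_id, parent_span_id = extract(msg)
assert tp.split("-")[1] == trace_id  # same trace continues on the consumer side
```

Because the broker can read and update this same carried context, it can parent its own receive and send spans to the application's trace, which is what stitches the application, API, and broker layers into one end-to-end trace.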
And a send span tells you when that message was sent to the consumer, what response was received, and how long it took to receive that response, so that you can track delivery and understand whether it was successful, why it failed if it failed, and the timing of all of that. Those spans are collected by the broker and packaged up into messages that get sent to the OpenTelemetry Collector you mentioned. In the OpenTelemetry Collector, you've got the ability to feed in OpenTelemetry Protocol (OTLP), but you also have the ability to embed receivers. A receiver takes a proprietary protocol and converts it into OpenTelemetry. The collector then does processing, filtering if you want it, and hands the data off to exporters, either over OTLP or through an embedded exporter for whatever observability tool the customer has chosen. There's such a wide variety of those that the collector is really a key piece of the distributed tracing system, because it enables handing spans off to pretty much any destination from an observability standpoint. And the OpenTelemetry community has worked so hard on standardizing this process and making sure there's a standard followed in the industry for traces as they flow through. The advantage right now is that, regardless of which observability back end you have, whether it's Jaeger, Datadog, or Dynatrace, to name just a few, and we're just scratching the surface here, you can take all the traces that come out of the OpenTelemetry Collector and stitch them together the way you want within your observability back end. That way you have, again, a standard way of processing the traces produced by the OpenTelemetry Collector. All right, well, this is great for being able to do distributed tracing in an event-driven architecture. So let's assume right now I am an organization and I am way ahead of the game.
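The receiver-processor-exporter pipeline described above is wired up in the Collector's YAML configuration. The fragment below is a hedged sketch, not a drop-in config: the `solace` receiver stands in for whatever broker-specific receiver your distribution provides (check the collector-contrib documentation for availability and exact settings), and the endpoints are illustrative placeholders.

```yaml
# Illustrative OpenTelemetry Collector pipeline: broker spans and OTLP spans
# in, batched, then exported to the chosen observability back end.
receivers:
  otlp:                      # standard OTLP ingest from applications
    protocols:
      grpc:
  solace:                    # hypothetical broker-specific receiver that
    broker: [broker.example.com:5671]   # converts a proprietary feed to OTel

processors:
  batch:                     # batch spans before export

exporters:
  otlp:
    endpoint: backend.example.com:4317  # e.g. Jaeger's OTLP ingest; illustrative

service:
  pipelines:
    traces:
      receivers: [otlp, solace]
      processors: [batch]
      exporters: [otlp]
```

The key design point is the one made in the discussion: because receivers normalize everything into the OpenTelemetry data model, swapping the back end is a one-line exporter change rather than a re-instrumentation of the mesh.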
I know it's important to have an observability strategy in my system, and I know I need an event-driven architecture strategy as well for my solutions. What are the advantages of having distributed tracing supported in the event-driven architecture that I have? What benefits would I get from investing in enabling my EDA with distributed tracing? So there are really four major benefits. First of all is debugging, which is a core use case of tracing: it allows you to debug your application while you're building it in a dev environment and understand exactly how a message flows. Is it flowing exactly the way you designed the application to, and can you validate that? On the flip side, you can take that same capability into operations and do troubleshooting. When something's going wrong, you can look at your traces and understand, okay, this is what was going on with that application; these messages were going where they were intended, or something was wrong there, and figure the whole thing out. Third is the ability to monitor and optimize. This allows you to step back, look at your event mesh, and understand what's happening with it: how long it's taking to get to different places in your mesh, and even optimize things like your topics. If the same messages are on different topics but going to the same destinations, you can combine them and simplify things, for example; or maybe you want greater granularity, and so you build out your topics that way. And last but not least, we've got data lineage, which in this context really boils down to proof of delivery. Can you tell the application developer what really happened with that message and guarantee that a particular message got all the way to the end? What happened to an individual message?
It's really proof of innocence: how fast can you demonstrate to the application team that the event mesh is working right, or get to an understanding of what really went wrong and how to fix it? This is all great input. Thanks, Rob. And what I think would really make this come to life and make it even more exciting is a demo. Another thing I really like about the OpenTelemetry community, apart from standardizing things and working so hard from an open source perspective, is that they have developed the OTel demo, a very interesting and neat demo with pretty sophisticated moving parts. You can check it out by going to the OpenTelemetry website. This demo is basically an e-commerce application with different moving parts, front end, back end, accounting services, fraud detection, all the pieces of an e-commerce system, to reflect what a real application could look like. On top of that, they added distributed tracing through all the moving parts, so you can see the context of a message or event being propagated throughout your system. A messaging piece has also been added to this demo, demonstrated with a Kafka broker in the mix. What we did here at Solace is take that demo, which is open source in a repo, and add a Solace component to it. So now you can run Kafka and Solace hand in hand doing the same activity, and through this demo you can publish messages to your Kafka broker and to your Solace broker and see all the extra details we were talking about, what happens within the message broker.
So instead of seeing the broker as a black box in your system, with a message going in and a message going out, you can delve deeper, almost do surgery within your message broker, and see: which queue did the message go to? What happened within the broker? Did it get sent to the final consumer or not? It really adds more light and color to the EDA component of your overall system. So stay tuned; I'll show you where you can find this demo so you can run it on your own machine and play around with event-driven architecture and distributed tracing. That's what we have for you folks today. We hope this was informative. We talked about OpenTelemetry, distributed tracing, event-driven architecture, how to implement distributed tracing within an event-driven architecture, and lastly the demo that's available to bring all this to life. So thank you again, Rob, for all the content and information you provided. I think that was pretty informative and interesting. Thank you, Tamimi, and thank you for all your work on the demo from a Solace standpoint. It's very valuable, and I'm looking forward to seeing how we can continue to evolve it across so many different components and protocols. It really fits very well with event-driven architecture. I'm sure this is just the first step. Yeah, just scratching the surface. Fantastic, thank you.