Hi everyone. This is Alolita Sharma. I am one of the co-chairs of the Technical Advisory Group for Observability at the CNCF, and I am happy to present this session giving you an update on what we have been doing in the technical advisory group, also known for short as the TAG. I'm here along with Matt Young, who is my fellow co-chair. Our third co-chair, Richard Hartmann, is unable to join us today. First of all, I'd like to do introductions, starting with a very quick introduction for Matt. Matt is a principal cloud architect at EverQuote and is deep in the midst of architecting infrastructure, services, and developer-focused tooling for all things cloud at EverQuote. He has an extensive amount of experience and specializes in research as well as development of embedded systems, virtualization, and distributed applications, and his architecture work covers a lot of different areas. Matt has also been an awesome co-chair for CNCF SIG Observability and, of course, in his copious spare time, he finds some zen in his family, motorcycles, and tennis matches. That's it. Matt, over to you, and let's walk through the rest.
Certainly. Thank you, Alolita. Hello, everyone. Alolita joined us recently and is our newest co-chair. We are thrilled to have her fill our third co-chair spot. She brings with her a wealth of industry experience and background, not only in the technical aspects of observability tooling but also, just as importantly for the TAG, experience and a demonstrated track record of building open source communities, fostering them, and shepherding them from a technical perspective as well as from an interpersonal and organizational one. She is a member of the OpenTelemetry Governance Committee. She's on the board of directors of the Unicode Consortium. She has served on the boards of OSI and SFLC.
In addition, she's led engineering teams at places you may have heard of before, such as Wikipedia, Twitter, PayPal, and IBM. Most recently, she is heading up observability and OpenTelemetry work at AWS. So we welcome her, and we're really looking forward to what comes next.
Before we start, I would really like to call out some words that speak to us from the 70s and 80s. Donella Meadows was a researcher who was the first person to tell us that we as a species are going to run out of wood and oil. She was part of the Systems Research Group out of MIT with Mr. Forrester. Posthumously, a book was published a few years ago called Thinking in Systems, and it summarizes almost 30 years of her work. It's relevant in particular for observability because the work that she contributed to, and in many cases spearheaded, gave us concepts like reinforcing feedback loops and stocks and flows. Basically, it is the control theory that was the precursor to the declarative state and controller reconciler pattern that we find in Kubernetes and operators, and that has really driven a lot of the complexity that observability tooling has grown in the open source space to meet. One of her quotes that I find the most encouraging and relevant here is: in spite of what you majored in, or what the textbooks say, or what you think you're an expert at, follow a system wherever it leads. It will be sure to lead across disciplinary lines. Later on, we'll be talking about some of the futures and the potential that we have in the TAG. We welcome and desperately need folks of all disciplines: program managers, product managers, people who enjoy writing, video editing, all manner of disciplines, specifically not just engineering.
With that, what is this observability stuff? Is it this? Is it just lots of screens, so that everywhere you look, you don't need to move a muscle, just your eyeballs, and you can see all the things? Maybe.
To set some context, though, think back. Not so long ago, in the beginning, we had monoliths, and they were great. They weren't broken. They just worked. And then we had to go and break it all, or supercharge it, depending on your point of view. Either way, microservices and everything that has come after them, from serverless to event streaming to big data and the lot, have dramatically increased the complexity of observability tooling. We've created wonderfully complex monsters, and we need wonderfully monstrous tools to wrangle what's happening as we attempt to comprehend the complicated interactions and the systemic behavior of the cloud native systems that we're so actively engaged in creating. There are a lot of ways to define observability. It has become almost a buzzword, but to me, monitoring tells you whether the system works. Observability lets you ask why it's not working. And in that nuance, I think we find space and opportunity for a tremendous amount of innovation moving forward. So, with that, take it away, Alolita.
All right. I think that's a very good segue into what observability does for you. Fundamentally, as in the quote we looked at right before, observability gives you as a user the ability to understand how your system is functioning and to tell whether there is a change in the behavior or the health of the system. It's important first of all to collect the data, then to process and analyze that data, and then to understand the behavioral patterns that tell you what is wrong.
And then you remediate and, hopefully in the long run, move towards self-healing systems, where the systems are smart enough to prescribe and adopt fixes that remediate a particular change in behavior, so that you have continually running, optimal systems. With that said, the types of data that we look at in observability today are particularly focused on three data signals. There are metrics, which are aggregatable. You could be picking up delta metrics or cumulative aggregations and collecting them from all kinds of data sources to better understand what the behavior, the pulse, of the system is at any particular point in time. You have tracing, which looks at traces through an application and at what the application is doing over a span of time, as a snapshot of data. And then you have logging, with logs, which are constantly collected records, snapshots of data for particular actions happening at a particular data source. You collect those and transmit them to a monitoring system, where you can look at the behavior, apply sophisticated data patterns, and analyze and assess the behavior further. Can you move to the next slide?
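The difference between delta metrics and cumulative aggregations mentioned above can be made concrete with a small sketch. This is only an illustration and not the API of any particular observability library. The function names are hypothetical, and treating a drop in a cumulative series as a counter reset is an assumption made for the example.

```python
from itertools import accumulate


def deltas_to_cumulative(deltas, start=0):
    """Turn per-interval increments into a running total."""
    # accumulate(..., initial=start) yields start first, so drop that element.
    return list(accumulate(deltas, initial=start))[1:]


def cumulative_to_deltas(cumulative, start=0):
    """Turn a running total back into per-interval increments."""
    deltas = []
    previous = start
    for value in cumulative:
        if value < previous:
            # Assumption: a decrease means the counter restarted from zero,
            # so the whole new value counts as the increment.
            deltas.append(value)
        else:
            deltas.append(value - previous)
        previous = value
    return deltas


# Requests observed in four consecutive collection intervals.
per_interval = [3, 0, 5, 2]
running_total = deltas_to_cumulative(per_interval)
round_trip = cumulative_to_deltas(running_total)
```

A delta series carries only what happened in each interval, while a cumulative series carries the total since the counter started, which is why a monitoring system that receives one form may need to convert it to the other before aggregating.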
So, as you see, the fundamental data signals of metrics, tracing, and logging are already being used in the monitoring systems that we have, especially for cloud native platforms but also for cloud native applications. And as we move into making the full stack of an application observable, all the way from the bare metal to the operating system kernel, to the networking stack, to the middleware, to the application frameworks, all the way to the application, you have continuous profiling data, for example, coming in and intersecting with traces and metrics. You have crash dumps coming in in real time from different systems. You may also have other types of signals coming in, such as events, which could be construed as logs or as events in themselves, with some metrics wrapped in. Next slide, please.
So, that said, this is kind of a moving target right now, because there is a lot of data, trillions and trillions of petabytes of data, being collected from systems that we are constantly monitoring. And as we move towards more sophisticated observability, testing, fuzzing, and chaos testing are all areas that will intersect with how we are making systems more observable and building end-to-end pipelines to handle that. With that said, I would like to show you some of the complexity that we have in this space that we are looking at in observability. As Cindy Sridharan says, while plainly having access to logs, metrics, and traces doesn't necessarily make systems more observable, these are powerful tools that, if understood well, unlock the ability to build better systems.
And as you can see, that matters because it optimizes our pre-production pipelines. The types of tests listed here, unit tests, functional tests, performance tests, fuzz tests, all kinds of testing, as well as threat modeling and security testing, are all affected by our ability to have more observable data and to self-correct the behavior of systems if they're out of whack. Similarly for deployment: in deployment environments you have integration tests, load tests, shadowing, and soak tests. For releasing, you have canaries, traffic shaping, and exception tracking. And for post-release, you have logs, events, chaos testing, A/B testing, user monitoring events coming in, auditing, and even on-call experience, all the way from your customer. Each one of these areas has data that is collected, which, as we look at the system end to end and holistically, gives a very good understanding of the current health and status of the system and leads to a far more complete, well-adapted, end-to-end experience for end users. So with that said, I'll switch over to Matt, who will talk more about why observability matters.
Thank you. I'll take this opportunity to say that in the TAG today, in the group that we have formed over a little more than a year now, we have tremendous representation from the vendor community as well as the project community, meaning projects primarily focused on observability tooling. We'll talk about that a little later. But the TAG is a place where not only vendors and projects that are making observability-related tooling can come together, but also end users who are using these technological building blocks, and the capabilities that they provide, to deliver value to their businesses.
And so this is actually a slide from a few years ago, when we set out to build an open source observability platform. Not because it was open source, but because it solved business challenges in a way that was forward-looking, that was what we needed, and that we could engage with, contribute to, mutate, and expand. I think that many others in similar situations have some of the same values that we took on. It's worth understanding the why. Why are we engaged in this? The technical pursuits are a delight of their own, right? We can look at this as engineers and ask what all the capabilities are, what the cool things are that we can accomplish. But people trying to use this stuff in the real world have perhaps a different view of what's valuable. Sometimes things that are technically nuanced and interesting might be too complicated to understand, or not pragmatic to deploy. Some of those same capabilities that Alolita covered translate directly here, but they come up as: how do we handle incidents when we're being attacked by hackers, or when we're being attacked by our own bugs, because our services are face-planting and that hadn't been accounted for? What do we do? How do we respond? What's the workflow, and how can these tools inform that? How do we secure our intellectual property, but also, more importantly, how do we secure the data that our customers entrust to us? We're the stewards of that data. As more and more businesses go online in verticals such as health insurance, medicine, and biology, all of these fields have customer information and personal information that might be not only sensitive from a privacy perspective but also medically protected in a regulatory way. So how do we do that?
How do we observe these systems while protecting the integrity and the privacy of our customers? On the operational side, how do we do autoscaling? How do we detect anomalies? How do we know what things are costing and how we might be able to optimize them? From a values perspective, observability tooling helps us achieve these business goals: to be fiscally responsible and, in a perfect world, not to have to scale infrastructure costs with traffic, for example. For all manner of business goals, the value provided by observability tooling, by the ability to see what's happening and why and how these things relate to each other, is vast. And there's again a huge opportunity to innovate in the space and to build atop and with what's already been put in place over the last few years, which is in its own right substantial. So it's an exciting time to be in this space, and we welcome all.
No KubeCon talk is complete without seeing this chart at least once. If we focus in on just the observability section of the landscape, it has grown dramatically and continues to. There are so many companies and projects that are thriving in the space, and part of the TAG's function is to work with them. So, I've said TAG a bunch of times. I should really define what it is. This diagram is courtesy of Chris Aniszczyk, CTO of the CNCF at the Linux Foundation, from a blog post he made announcing new TOC members earlier this year. The purple box here, special interest groups, is what we used to be called. Interestingly, just over a year ago, SIG Observability was formed, and all within the same year, SIGs were renamed to TAGs. TAG stands for technical advisory group. TAGs exist to inform the TOC on gaps in the ecosystem, as well as to foster those same ecosystems in a number of ways, consistent with their charter.
A little over a year ago, members of the community came together and created the charter for TAG Observability. It was a consensus document contributed to by over 20 people from a number of CNCF members. I've highlighted some excerpts that are really relevant now as we look forward to what comes next, for the rest of this year and into 2022. The top-level item really is to foster and sustain the observability ecosystem. Again, that is inclusive of end user community members, vendors, and projects in the observability space, but also projects generally within the umbrella of the CNCF. Nearly any project has observability goals and concerns, and so the TAG is a place where these groups can come together in productive discussion.
We exist to identify gaps in the CNCF portfolio of projects. For example, the Octant project out of VMware came to present to our TAG a few months ago. They're an Apache 2 project. They are not part of the CNCF, yet they are in scope because they are an open source observability tool. Similarly, end user community members have written their own tools, command-line tools and other solutions of all types, that may one day be in the CNCF umbrella of projects but are not yet, and are open source. So they are in scope, and they are part of our community.
We exist to curate and disseminate patterns and best practices. Speaking from personal experience, I've benefited greatly from being able to collaborate with others in the space, as at my company we have been deploying an open source observability stack with some success.
We exist to provide users with unbiased information. As all of these vendors come together under a shared banner, we can produce guidance and curate information that is not a pitch, but is just there to provide information that can be trusted. It's a vendor-neutral setting for thought validation.
Many of the open standards that we have exist primarily because vendors have decided that interoperability is paramount. I will talk about this more, but the TAG exists specifically to provide that forum, to have those discussions in a structured and safe way.
Over the last year, the TAG has participated in due diligence assessments and reports as projects seek to move from the sandbox to incubation. I need to be clear that the TAG is not an arbiter. It's not a decider. Again, it's a place where domain experts congregate and are committed to providing the Technical Oversight Committee with unbiased, usable, actionable information as part of due diligence, so that end users can be assured that incubation means a certain level of assessment and foundational homework has been done. The due diligence reports that have come out of these efforts, and all of the other due diligence reports across the umbrella of CNCF projects, I have found immensely useful as an end user. I encourage you all to check them out. If you're trying to figure out whether a project is a good fit, or how it is a good fit, these are a wealth of information provided in the open. So we're proud to have contributed to a number of those due diligence reports.
In addition, we've had guests from open source projects that are entering the CNCF in the sandbox phase. Pixie is an example of that. New Relic has recently open sourced Pixie, or rather has donated Pixie to the CNCF, and they're entering. They came in and had a chance to talk with us. So the TAG really is a place where it's a little bit of what's new, what's current, what's challenging, what's pressing. All of those topics might be found at one of our meetings. Lastly, we kicked off a working group on an end-user-driven observability white paper, which is in progress and will be out for community review in the coming months.
So that working group has launched successfully, and we're seeing some great collaboration between our end user community members. We'll talk about what's coming next.
Yeah, absolutely. And thanks, Matt. We are in exciting times in observability. As you know, there are the open source projects that are all under the CNCF umbrella, whether that's Thanos or Cortex or OpenTelemetry, and many other projects like Pixie, all working towards the specific goals that they have and towards the coolest and newest features in their repertoire. What we would like to do, and this is something I invite all of you to come and think about and discuss with us, is provide more guidance on what's coming in the ecosystem. Obviously, as subject matter experts, many of us track what's coming up and what's being built in the observability tool sets and toolkits, and very actively work on it. We would like to build that out and make it available as webinars and as YouTube talks on the CNCF TAG channel. We can also provide case studies from end users, covering the different pipelines that end users and developers are using for the observability components in our open source projects, as well as have project demos through the year, from projects that are within the CNCF umbrella as well as outside it. Cross-collaboration is very important, so we do think that being able to demo and learn from each other is a great way to build interoperability and open standards, making it easier for customers to use observability components out of the box.
Another area we're looking at is work group topics, such as continuing to have cross-project engagement, where we discuss, as I said, collaboration as well as open standards, interoperability test suites, and other ways to collaborate closely together. There are also other projects, such as gathering feedback from end users for persona development, which is a joint project we've been working on with other sister projects. Needless to say, there are many, many companies that participate in our discussions and have done so over time. We have TAG participants from multiple companies, and these are just a few of them listed here. Thanks to everyone for participating and continuing to participate.
Last but not least, I would also like to welcome all of you to join us for these discussions. An advisory group meeting or a user group meeting is only as good as the participation we have in it from everyone, all of us and all of you. Just to call out, we have meetings on the first and third Tuesdays of the month. There are observability meeting notes that we maintain, and you can go and look at them anytime to catch up on what was discussed, but these meetings are also recorded, so you can catch up on our YouTube channel. There is the GitHub repository, where you can track all the active work that's ongoing, whether that is evaluations or any other comments or feedback that people are adding. There is a Slack channel, tag-observability, that you can join to meet fellow observability members and experts, as well as a mailing list, where there's a fair bit of activity, for example around the white paper and other work groups that are ongoing. We'd of course love to have more work groups, and we really welcome your feedback. We welcome you to come and help and join in the discussion.
If you have a passion for developer tools and experience and you want to share your experience, this is a great place to do that. The idea is to improve observability in general, in terms of the open source projects that we have and the different discussions that are ongoing. That includes engineering discussions, technical discussions on implementation, design practices, and good practices on how end users can build and consume instrumentation well or make configuration easier. There are a lot of topics in this space, as well as solutions that work for you and tools that you're building. So, that said, I would love to invite you again to please join in. Last but not least, I would also like to call out and acknowledge all the folks who have participated very actively in our meetings. This is quite a large list, and it's really been exciting through the whole year. We're looking forward to having lots more participation and lots more cool areas that we work on in observability in this coming year. So again, we would love to invite everybody to join in for the TAG meetings that are regularly held, as well as follow us on the Slack channel, and we look forward to seeing you on more observability projects at the TAG meetings. I'm signing off now. This is Alolita Sharma, and Matt here, Matt Young, as co-chairs. See you there.