Hi, everyone. My name is Libby Merrin. I'm going to talk to you today about something that is not about OpenTelemetry or open observability technologies and architectures. There have been some great talks that have already covered a number of really cool things. But what I'm going to talk to you about is this: everyone here knows observability is important. But how do you actually convince your organization to invest in observability, or in more observability? Especially before something catastrophic happens and you have to react.

A little bit about me. I've been doing developer relations for a few years now. I'm part of the OpenTelemetry Communications SIG and also the End User SIG. And my first contribution to open source was OpenSSL, many, many years ago.

Okay. Opening the door to observability in 3.5 steps. Observability provides the knowledge and confidence so that different parts of the business can move faster. I think everyone here is familiar with the benefits for your software, your infrastructure, and your products as you build and deploy them: you can handle planned and unplanned outages, you can determine cause and effect, and so on and so forth.

But observability also has benefits for other parts of the organization. For your marketing and sales teams, it can help them meet their lead-generation goals, develop the customer prospect pipeline, and understand at-risk customers. It can help them not only land customers but also expand your product footprint with them and grow revenue. It gives you a better understanding of your current and target customers, so that marketing and sales know where to spend their dollars and effort. For example, is it better right now to spend more time on social media, or to actually buy ads? And it also helps focus the sales team.

Leadership also gets many benefits from observability, whether that's your executives through to your team leaders.
It helps them determine the ROI of investments and be able to communicate that when looking for investment external to the company. It enables product decisions based on your customers and market, and it can take into account your competitive data. It helps justify decisions to whoever the company reports to, whether that's a board, Wall Street, or investors. And instead of having to wait for reports to be generated based on known questions, which of course is limiting, it enables asking questions at the time, regardless of what comes up: if our competitors are doing X, and we did it too, what would the impact be? What would it take, based on this amount of dollars or refocusing these resources? And it also enables determining things like what would make the team more efficient, or how we're retaining our customer accounts.

It also affects IT, which seems somewhat obvious, but not only that: the security, legal, and compliance aspects of the business. You no longer need to do after-the-fact auditing; you can determine holes and risks at any point in time and adjust your processes and tools effectively. It also means that when laws change, whether you're going into a new country or a new market, or, hey, laws just change, you can address them proactively and more easily. It addresses tool sprawl, which is an ongoing issue for all of these departments. And it enables responding to incidents more quickly, different types of incidents, your security ones, your software and infrastructure ones, and perhaps preventing them.

Okay, so step one: you need to explain observability. Now, we all know that observability is not just some sort of alert or metric telling us there's a problem. It's actually bigger than that. But depending on who your audience is, you need to help them understand what observability is.
And speaking of what observability is: is it those three signals, metrics, traces, and logs, allowing you to get contextual answers to your questions, regardless of what the question is? Is it the ability to understand the internal state of a system based on its outputs, which is the control-theory definition I think everyone's familiar with? Or, and I think this is my preferred definition, is it a measure of how well you can understand and explain any state your system gets into, no matter how unique or bizarre, without having to ship new code? That definition comes from Charity Majors, Liz Fong-Jones, and George Miranda. Feel free to copy it.

You need to make your communication of what observability is appropriate to your audience. How you talk to your CTO or an engineering manager is different: they likely get what observability is, and they want to go quite a few levels down. Your finance person likely doesn't need that. And it's really important to underline that observability is a practice, like security, reliability, et cetera. It cannot be bolted on.

Okay, step two: you need to describe the impact of not having observability. At its very heart, you're being reactive. Think about the pandemic, and how organizations decided to be lean and mean, live life on the edge, and not amass surplus shipping containers, and then the resulting effect when the pandemic happened: a supply chain shortage that has continued for over a year now with no end in sight.

But even beyond having to be more reactive, it's more dangerous than that. Issues affecting customer experience reduce your customers' trust in the product and in the business. In fact, four out of five customers in a recent study would leave a brand after three or fewer instances of poor customer experience. The remaining one in five would leave after just one poor experience.
You may not be operating at the level of an Amazon, with that level of infrastructure, but Amazon found, and this is going back 10 years, I can't imagine what it is now, that with every 100 milliseconds of latency, they lost 1% of their sales revenue. You also have difficulty understanding the customer's experience, and you have to be reactive to problems, which could mean customers find bugs before you do. That's not a situation anyone wants to be in. And because you don't have a contextual, cross-system view, you struggle to find root causes and end up with a bunch of unexplained system behavior and near misses.

Okay, engineers. I know all of you know this, but this might be a way to communicate it to whoever holds the decision to sign off on whether to invest in observability or not. You know that gone are the days of using a debugger to walk through your code and understand the issues; with the move to modern technologies, cloud-native, distributed systems, microservices, et cetera, that logical flow is just not possible. Engineers today deal with more and more tools. In fact, on average an engineer deals with 15 different tools in order to check in a change to the code. And when you consider that CI/CD pipelines are usually cobbled together from a number of different tools, all of which have their own way of reporting issues, often without context, it becomes incredibly complex to measure, understand, and fix issues, let alone get ahead of them.

And all of that means that your engineers, by which I mean developers, SREs, DevOps, insert the relevant title here, are spending time reacting to issues instead of doing their day job. Those repeated issues take time away from building new features or products, and with complex systems the likelihood of an issue requiring time from multiple teams increases.
You've got the people who wrote the application or feature, your DevOps team, the team responsible for infrastructure, site reliability engineers, security engineers, your support team, and so on and so forth, which not only makes the resolution time longer, it also adds the sunk cost of all of those employees' time, which takes them away from focusing on efforts that would achieve the business goals and likely grow revenue.

Alert fatigue is real, as we all know. It's not only the potential loss of employees leaving the team or the organization; there's a dollar impact of losing an employee. It takes around 94 days to find and fill the gap, and I would argue that's quite a generous estimate. It probably takes more time, particularly as you get into more senior and experienced roles. Plus, of course, the team time spent on interviewing and finding those resources, those new hires. It's estimated that the cost to replace or hire someone is about 50 to 80% of the employee's salary, with the overall cost of losing them ranging from 90 to 200%. And I have the references in the slide deck, which I'll make available on Sched.

Impacts on the business. Your fragile system leads to slow feature and product delivery, which reduces your ability to grow the business or meet revenue goals. You lose the ability to understand business-impacting issues, such as: how many customers are using feature X? Are you billing them correctly based on usage?

I think many people here will remember the Atlassian service outage earlier this year that lasted two weeks. It was determined to be caused by running a script with the wrong execution mode and the wrong IDs, which resulted in sites being deleted for about 400 customers. That not only had the effect of unhappy customers, you may know or be part of one of those customers, but also the time the majority of the business had to spend over those two weeks dealing with the issues.
And that goes right across the business, from customer support through sales reps and, of course, engineering. And it's really important to connect those impacts to your organization's goals. You know your organization better than anyone.

Step three is building your case, and you need a plan. If it was working... I don't think we're going to get the Wi-Fi to work. Well, that was a really cute little video of South Park and the underpants gnomes, if anyone's familiar with that. The underpants gnomes had a plan for profit: step one, collect underpants; step two, ...; step three, profit. Which means, of course, that you need a real plan.

So, the first part of your plan: figure out what your objective is. It could be one of these relatively common ones: finding out about system issues from the software, not from your customers or employees; reducing outages and service disruptions by some percentage or number; resolving incidents faster; being able to understand the unknown unknowns; managing tool sprawl to save money and reduce inefficiency; or increasing developers' ability to innovate, resulting in an increase in speed and efficiency. It's really important, of course, to make your objective measurable. And it needs to be tied to your organization's priorities. For example, if your organization is trying to launch in a new market or region, being able to observe the system behavior and the customer response to it enables you to do things like see the impact of your go-to-market strategy and change it in real time, which has measurable and immeasurable savings, I'd argue.

Okay, step 3.2: what are your options? Let's talk a little bit about platform selection here. I can't recommend specific products or platforms. It depends on your objective, your infrastructure, how your teams operate, and many other details unique to your organization. But here are some generally applicable considerations to bear in mind. Consider open source. I mean, we're all at an open conference.
I think we all appreciate open source. Of course, it's worth reiterating that open source is free like a puppy, not free like beer. It has its costs. But it does help you avoid getting locked into vendors. It's tough to find a single observability platform that meets all needs, and no shade on any of the vendors represented here today; it's the fact that our world is changing so fast, technology is changing so fast, that it's impossible for all of the future requirements to be known today in time to build for them. It's always a catch-up game. Many vendors integrate with open source products such as Prometheus.

Using open standards, well, we've heard a lot about OpenTelemetry today and vendors adopting it. Using open standards, of course, helps with future-proofing, particularly if you're building your own platform. And it helps your platform components be pluggable, so if something's not working for you, costing you too much, insert the reason that's applicable, you can, of course, swap it out. It also reduces the need for your operations and, in some cases, your development employees to specialise. Specialising means those employees need to get trained up; if you lose them, you have to find replacements, and in the meantime you've lost that knowledge.

Many of these considerations come from real-world learnings, such as the company, and I'm not going to name names, who invested 14 months in a vendor's platform, only to realise it was costing them so much in maintenance, services, dollars, and engineer time that they then had to reinvest in moving to a hybrid solution, which took them another 12 months. I think there's a good reason for Gartner's prediction that by 2025, 70% of cloud-native applications will be monitored with open-source instrumentation.

As you go about this, I recommend carrying out a brainstorming exercise.
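As an aside, the pluggability point above can be sketched with a minimal OpenTelemetry Collector configuration. This is a hedged illustration, not something from the talk: the receiver and pipeline stay the same, and only the exporter block changes when you swap backends (the `prometheus` exporter and port chosen here are just examples).

```yaml
# Minimal Collector pipeline: applications send OTLP in, one exporter sends data out.
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  # To change backends, replace only this block (for example, with a vendor's
  # OTLP endpoint); the instrumented applications are untouched.
  prometheus:
    endpoint: "0.0.0.0:8889"

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
```

Because the instrumentation speaks the open standard rather than a vendor SDK, swapping the backend is a config change, not a re-instrumentation project.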
I'm sure as you're thinking about the different options for implementing observability at your organisation, you have a number in mind, but it's worth taking the time to try and, I hate this phrase, think outside the box. If you're able to just brainstorm, allowing any crazy ideas, you may find ideas you hadn't thought of, or options that may work better than the ones you had in mind. Once you've figured out all your options, prioritise them, then choose a maximum of three and list out the pros and cons of each. That's going to be important as you communicate it to whoever your stakeholders are.

Okay, step 3.3: address concerns upfront. So what does that mean? Well, some common concerns your stakeholders are going to have are: will this impact any of our business objectives? What are the different options? Why have you chosen the recommendation that you have? What will the impact be on our customers or users? And who else is on board with this idea? These are things you need to think about as you build your plan, because stakeholders are going to have concerns. Showing that you've thought about them and have an answer, even if the answer is "I don't know, but here's my plan to get back to you with the answer," helps build confidence in your plan, and in you as a result.

Your business objectives could be something like launching a new product or achieving X dollars in revenue. Choose the one that's most important to your audience if you can, but more importantly, make sure you've connected the impact of your plan directly to the objective, and of course in a way that's measurable and demonstrable. Listing your options and clearly articulating the pros and cons shows that you've done your homework. And while it's important to know who else is on board, the way you do that is to know your organization's influencers. Who are those people? Make sure you talk to them before you go to your stakeholders with your plan.
Get their feedback and, of course, incorporate it. A positive word from an influencer for whoever your stakeholder is can help sway their decision in favor of your case. Unfortunately, not all decisions rely on the logic of what's being presented.

Step, I think that's meant to be 3.4: build the plan. So, think about the cultural shift. We know that observability takes a culture; it's a practice, not something that's bolted on. Use the premise that all issues will be novel, especially in a cloud-native or distributed system, and especially as your company or its products or infrastructure grow and change. How will this affect your development teams, for example? Will it be additional work? Will they need to shift to incorporate new patterns, such as connecting engineering and business goals? Will there be a short- or mid-term cost for a long-term return on the investment?

As you think about execution: will you start with a single app or a single team? Will you prototype? Will you start with open source, or try and compare a couple of the commercial offerings? Will you take an inspect-and-adapt approach?

Cost is important. You need to know what you're asking for, and you need to make sure it's clear to your stakeholders what you need from them. And again, at least list what you know you know, and call out what you don't. Again, have a plan to come back with the answer. What you don't know may be, for example, how much this solution is going to cost, because those things can be hard to find out unless you have control over a budget; sales reps don't like telling you otherwise. But it's fair to say, hey, I don't know, but I'm having a meeting with so-and-so in two weeks' time and I should be able to come back to you with the answer then.

Yes, it's important to have success criteria, but it's not enough just to have them and measure them. You need to report on your progress, and you need to report often.
As you're communicating your plan, know that it's better to address issues as they come up, and missed objectives, along with your plan to address them and get things back on track, than to keep quiet in the hope that you can fix them. Being transparent and honest is the way to build stakeholder trust, which is important. It's their trust in you and their willingness to continue to support the program, assuming they've greenlit your plan.

Learn from other organisations: how they've addressed observability, the challenges they've faced, any pitfalls, and, of course, as I keep saying, customise to your organisation's needs and your specific objective. There are many excellent resources out there; you've heard some today on the benefits of observability. Use them. And this, of course, includes the people sitting next to you, or in front of you, or behind you in the room today, but also other communities such as the CNCF Slack. There's no need to reinvent the wheel if you can learn from what's been built already.

Okay, that was 3.4. Step 3.5: getting buy-in. You know your audience and your organisation best. You may need a write-up, a presentation to execs standing at a podium, or a white paper. But whatever the format, these are five things that need to be part of it, and in this order. Describe the opportunity and the objective. Describe the current state, the options, and the recommendation; show your work, basically, not only what you've looked at, but why you've selected the one you have. Be clear about what you want from your audience: are you asking for dollars, approval to go ahead, approval for a team to spend X% of their time away from what they're currently working on to work on your observability plan? If you can't weave your concerns into your plan as a whole, make sure you've listed them somewhere your audience can reference later, such as in an appendix.
And not just the concerns, but also how you'll address them. Then you outline the plan, the risks and mitigations, the cost, and the success criteria. Make sure, with all of this, that you're being really clear and concise. Leaders don't have a lot of time to read through things; there's a reason we call the start of a report the executive summary. If you have a lot of data, as mentioned, keep it to the appendix or as a reference, and then wrap up by reiterating the opportunity and its benefit to the organization. Basically, you're making the opportunity and benefits the bread in your business-case sandwich.

Okay, putting it all together. One: make sure your audience understands what observability is and what the benefits are across the business. Two: lacking observability has very real effects; make sure to tie them to your organization's goals. Three: build your case by, one, defining the objective; two, investigating the different options; three, dealing with possible concerns; four, building your plan; and five, getting buy-in.

Thank you. Just before I finally wrap up, one more plug for OpenTelemetry Community Day tomorrow, or OTel Unplugged. Please sign up, and tell your friends, at that tiny URL there. We're aiming to make it as good an experience for people attending virtually as for those attending in person. It's free to attend virtually, and there's a nominal fee for in person. And if that fee is a problem, reach out; we've got scholarships available. Thank you.