Anyway, thanks for coming. My name is Eric Peterson. You might know me as iamEAP on GitHub and all places around the web. Metrics have kind of come up as side topics in a lot of the conversations that we've had so far on stage today. And I kind of wanted to center it, focus it a little bit, and dive a little bit deeper. So why should you trust me? I'm just a software engineer. But before I worked at Spotify, I worked at a company called Tableau, a data visualization and analytics company, for about eight years. I wouldn't call myself a city-slickin' data scientist or anything like that, but I'm at least data literate. I worked at the company for a while. Naturally, I've been involved in Backstage for about two years at Spotify. You might have seen some commits from me in the TechDocs and Search areas, as well as the Analytics API itself. And a little fun fact about me that will become relevant later in the talk: despite my rather Scandinavian name, I am not Swedish. But I do live in the middle of nowhere, Sweden. You can see there my lovely workplace, where I look out across the water and the trees and the nature. It's quite nice. But as a result of living in Sweden, I care a lot about language. I learn it. I study it. And I help teach it sometimes as well. We'll get to that later. One of the things that I learned working at Tableau is that good analysis, good analytics, is indistinguishable from storytelling. And so I want to anchor the talk today on a little bit of a story that might feel familiar to a lot of the software engineers out there. Then we're going to sort of dissect that story through a data lens and see if there's anything to be learned. And then finally, we're going to get practical. What does that actually mean? What can we do? Let's look at some code. So to begin, a story that should hopefully feel familiar. Let's consider Bowie, the Backstage beaver. It's Monday.
He's getting out of bed at maybe 10 o'clock, 10:30 AM, something like that. I don't know when you get up, but he's a developer. He opened a PR on Friday. He got some feedback from his colleagues, good comments. So he gets right to work coding, making those commits, pushing them up. It's a good day. Once that commit goes up to whatever VCS he has, the builds get triggered in the background. Eventually, maybe like 30 minutes later (maybe he's working on a TypeScript monolith with just a sprawl of tests that are running, so it takes like 30 minutes to get feedback on a build), it comes back, and it's a fail. So he goes in, looks at the build logs, and he says, oh, well, it's a test that's kind of failing, but it's in code that he didn't really touch. It seems kind of flaky. So he just retriggers the build. He's still got feedback going on the PR, so he keeps making commits, pushing them up. He's not too worried about it. Next build's green, cool, getting more reviews. It's a day in the life of a developer. Now, a little bit later in the day, maybe it's after lunch, all of the feedback from his peers comes in green. They give it a thumbs up. Maybe he's commented out that flaky test; it keeps kind of failing and passing and failing and passing, you know, a little shortcut so you can ship it to production. It's time to ship it, so you ship it, merge it, the master build starts, and you're good to go. Now, it's been a long day. And maybe you go out into the forest, into the woods in Sweden, and you get a page in your pocket; you're on call. But you're good, you have your laptop with you, you've got your mobile broadband. So you open up the laptop, look at that email, that page, and acknowledge what's happening. You start diving into graphs and logs and things like that. You start looking: the error rate threshold is kind of high. OK, let's look at the logs. Oh, there's kind of like an error that's going on.
That seems related to that flaky test that maybe you commented out. That's not looking good. Well, OK, let's see. Let's go to the deployment log. OK, what other deploys have gone out today? Oh, shoot. It's just mine. Whoops. Well, you know, it's late in the day. You're probably not going to be able to get to the bottom of the actual error. So why not just go and pin the deploy that happened just prior to yours and call it a day? You'll get to the bottom of it the next day. For now, you're in the woods, you're enjoying it, it's time to party, right? Feels familiar, right? This is a thing that people go through on a fairly regular basis. Let's think about it through the lens of data. These days, you might see a platform team that goes into all of the systems that a developer might be interacting with. I really liked Teres's slide earlier in the day, where he showed all of the cloud-native tools that are available for CI/CD and observability, a lot of SaaS providers out there, like your GitHubs or your CircleCIs and things like that. You might have a platform team that goes out and starts collecting metadata from all of these different services. You might have a warehouse that contains every single commit that's made by every developer in every repository. When did the commit happen? Every build: when did the build happen? How long did it take? You might have a service that has all of the alerts that have been triggered and sent to the responsible engineer on call. When did the status change? How long did it take to go from acknowledged to cleared, that kind of thing? You might have every single deployment that's gone out in your Kubernetes cluster. How long did it take? What was the status of the deployment? Et cetera, et cetera, et cetera. Now, this can be good, right? If you're focused on a particular problem in one of those areas, maybe you want to measure deployment velocity, it can be really, really useful to have that information.
But it's also quite a bit of effort to actually make it happen. You've got to go out to every one of those SaaS providers, connect with their APIs, warehouse that data, transform it, visualize it. It's not an easy lift, one could say. And what's worse is that when you have that kind of really low-level metadata, like every single commit, it's really easy to start focusing on, okay, well, my top developers do 10 commits a day or something like that. And it's easy to get into situations where you're setting up commit quotas and you're managing to that, you're incentivizing, well, we need 10 commits a day from every developer, and that's the KPI that we care about. Kind of the worst thing about it, in my opinion, is that you just end up with all of these silos of data that are actually really difficult to connect to one another. There's no way to retell that story that we just walked through. You can look at the silo itself, but it's hard to look across those things. Now, I don't know if you remember, but we're at this thing called BackstageCon, and there's a tool called Backstage, and there are some really interesting advantages to having such a tool. If you think about deploys that are happening, builds that are happening: when Bowie is starting his day, diving into a PR or deploying something, where is he doing that? He's probably doing it in Backstage. When Bowie got that alert, got his page, and he started to dive in looking at logs and stuff, where did he do that? He probably did that in Backstage. And then at the end of Bowie's journey, when he was pinning that deployment to the last known healthy version, where was he doing that? In all likelihood, he was doing it in Backstage. So we have this unique opportunity in front of us that we've never had before, which is that the entire developer experience is done in one place, or at least the high-level points in that journey. Let that sink in, I guess.
This is sort of new. That's all well and good, but we're not quite there yet. What we need to do is sort of translate each of those key events that Bowie went through into data that we can then analyze later. And so what I'm gonna do now, being the sort of linguaphile that I am, is to help you learn the grammar of how to tell a story in Backstage through APIs and JSON and things like that. Buckle up. At a high level, there's this thing called the Analytics API in Backstage. It's event-based, like a lot of analytics tools. And it sort of splits the responsibilities among two constituencies, I guess. If you raised your hand and you were a plugin developer, you're there at the top. If you raised your hand and said you're an app integrator, responsible for a Backstage deployment, you're sort of toward the bottom there. Plugins themselves are the ones that are responsible for firing those events, like a deploy or a merge or what have you. They use a core API to do so, which is provided by Backstage core. It's been available for over a year now, I think. So it's there for you to use today. App integrators are responsible for providing a concrete implementation of that API, which is itself responsible for translating the event into something meaningful for an external analytics platform, like a Google Analytics or a Segment or an Amplitude or something like that. The events themselves can be quite simple. I like to think of them as super, super simple sentences that just have a verb and a subject. The verb is typically expressed in the imperative form, like navigate, deploy, merge. And then usually you tack on a subject to describe what that action was related to. So perhaps you would navigate to the catalog page, or perhaps you deploy, and then the name of a service or something like that. In order for a plugin to fire such an event, there is a React hook that you can use within components in your plugin.
It's just useAnalytics, which returns an object that implements the Analytics API. Typically what you would do is fire that hook in your component, you've got that object, and in some kind of event handler, like a click handler, you would call the captureEvent method on that returned object. And on that captureEvent method, you just provide that action, like navigate or deploy, and the subject, the thing that it was related to. Now, in order to help y'all who are responsible for Backstage instances, there are some core events that are provided out of the box for you. You don't have to instrument these things. Navigating around Backstage is captured as this navigate action. And it's sort of akin to a page view event in a traditional web analytics tool. And clicks on links and buttons that come from the core components library are also automatically captured, with a click action. And then the subject is typically the text that was clicked on. Now, if we wanna tell a story, we can't just use simple verbs and subjects, right? We need to be able to describe these things in a little bit more detail. At Spotify, we've had event tracking in our Backstage instance for quite a while. And one of the patterns that we saw before we had this Analytics API open sourced is sort of stuffing additional details into one of those two things, the action or the subject. It's not great, because then it starts to become difficult to understand, well, what kind of deploy was it, for example? So instead, what we have are these attributes that you can pass as a third argument to that captureEvent method. And it's just a simple string-to-string map. Doesn't have to be a string, I guess; it can be any boolean or number, that kind of thing. There you can describe, like adjectives, the event in a little bit more detail. So maybe you have a deploy action with a subject of the service name, but maybe the deployment type is an override, like a pin event kind of a situation.
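As a rough sketch, that captureEvent call with attributes looks something like the following. In a real plugin the `analytics` object would come from the `useAnalytics` hook in `@backstage/core-plugin-api`; the stub below only illustrates the (action, subject, options) shape, and the `deploymentType` attribute is a made-up example, not a predefined key.

```typescript
// Stub standing in for the object returned by useAnalytics(); in a real
// plugin component you would get this from the hook, not construct it.
type Attributes = Record<string, string | number | boolean>;

const captured: { action: string; subject: string; attributes?: Attributes }[] = [];

const analytics = {
  captureEvent(action: string, subject: string, opts?: { attributes?: Attributes }) {
    captured.push({ action, subject, attributes: opts?.attributes });
  },
};

// Inside e.g. a click handler: a "deploy" verb, the service name as the
// subject, and an attribute describing what kind of deploy it was (a pin).
analytics.captureEvent('deploy', 'my-service', {
  attributes: { deploymentType: 'override' },
});
```

The point of the third argument is that "override" lives in a structured attribute instead of being stuffed into the action or subject strings, so it stays queryable downstream.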
And in addition to providing a little bit more flavor, a little more detail on the event, you can describe, kind of like an adverb, how the event relates to things in the context around it. And so you can use this AnalyticsContext, sort of a context provider, somewhere in your React tree above where the event was fired, to provide details like: was the engineer on call at the time that the event happened, for example. And just by providing that context around the event, you end up with a JSON blob that kind of looks like the one on the right. You still have that attribute, but you also have this context showing that the engineer was on call. Now, just like there are core events, there are also core context values that are provided on basically all events. We provide a plugin ID for the plugin that provided the component where the event was fired. We provide an extension name for where in the UI the event was fired, for example, a deploy button. And on navigation, we provide a route ref, so that you don't need to know the exact URL of the place where the event took place; you can get the route ref identifier. And all of this is provided for you out of the box, as long as you use the core createPlugin and create-whatever-type-of-extension methods, as shown on the left there. The extension name comes from the name of the extension at the bottom, like a deploy button, and the plugin ID value comes from the ID of the plugin provided when you create it. Now, for those of you who are not necessarily plugin developers, who are just interested in actually having access to this data: all you need to care about is the Analytics API interface itself. It just has that one method, captureEvent. And you just need to provide a definition for this method. By default, it just kind of gets black-holed; nothing happens. You could even provide just a console.log, and it would log it to the developer tools console in the browser.
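Putting the pieces together, the app integrator's side of the contract can be sketched as one interface with one method, plus trivial implementations. The field names below (action, subject, attributes, context) mirror the shapes described above, but treat the exact types as an assumption, a simplified stand-in rather than the canonical `@backstage/core-plugin-api` definitions.

```typescript
// The event shape: a verb, a subject, optional "adjective" attributes,
// and "adverb" context (pluginId, extension, routeRef, onCall, ...).
type AnalyticsEvent = {
  action: string;
  subject: string;
  attributes?: Record<string, string | number | boolean>;
  context: Record<string, string | number | boolean>;
};

interface AnalyticsApi {
  captureEvent(event: AnalyticsEvent): void;
}

// The default behavior: events are black-holed, nothing happens.
class NoopAnalytics implements AnalyticsApi {
  captureEvent(_event: AnalyticsEvent): void {}
}

// The simplest useful implementation: log every event to the console.
class ConsoleAnalytics implements AnalyticsApi {
  captureEvent(event: AnalyticsEvent): void {
    console.log('analytics event:', JSON.stringify(event));
  }
}

// An event like the "pin that deploy" moment from Bowie's story,
// enriched with core context values and an on-call flag.
const event: AnalyticsEvent = {
  action: 'deploy',
  subject: 'my-service',
  attributes: { deploymentType: 'override' },
  context: { pluginId: 'ci-cd', extension: 'DeployButton', onCall: true },
};
new ConsoleAnalytics().captureEvent(event);
```

The interface is the seam: everything upstream (plugins, core components, context providers) stays the same no matter which implementation the app integrator wires in.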
Or you could imagine, like, an ACME analytics service of some kind, where you just push the event onto a queue and batch them off to some service somewhere. In reality, it's actually probably even easier than that. There are analytics modules that are open source that you can install and use out of the box today. I believe Google Analytics is the open source one that comes to mind. I've also seen a PR for a Segment analytics provider that's still sort of in progress, but this is an area where you can absolutely get involved and plug in any kind of analytics provider that you use. PRs and issues and things: welcome. Or we can hack on it at KubeCon. Now, where are we at today? Like Teres was mentioning, there are things that we just don't have yet. It might be interesting to understand when people are creating, updating, rotating out secrets, but we don't have a way to do that in Backstage right now. But when we do, we should absolutely instrument it. There are a lot of really important actions that users take in Backstage today that are as yet uninstrumented. So my goal here is that now that you are aware of this API, and now that you see the opportunity ahead of us, being able to retell that story in data, we will have much better event instrumentation in Backstage going forward. So once again, we have the story, Bowie's journey on a Monday at 10:30 a.m. Everyone in this room now understands the grammar of how to turn that story into data. And my hope is that next year at BackstageCon, we will be able to see how we can turn that data into something meaningful, something that shows that the investments that we're making into our Backstage deployments are moving the needle. So, thanks, and I guess if there's time for questions, I'd love to take questions. So I really liked the look of the React analytics stuff that was up there.
Is there something with a similar syntax that we might see soon from Backstage on the backend, some sort of analytics that is similarly presented and similarly accomplished from the backend side of things? Great question. Is there a backend-oriented analytics API in the works? Something that I'll mention is that we very intentionally designed this API to use hooks, so that it could really only be used in components, which is what real humans interact with. And so this API is very narrowly focused on user behavior. I think there's absolutely space in the backend for observability generally, right? That's not my area of expertise for sure, but I think there are efforts in that area. Eric, great talk. So I think a lot of this talk focused on the metric collection part, instrumenting the front end to collect it. And I think you alluded to this in the penultimate slide: great, we've collected these metrics, so what's next? Can you give us an inkling of how we should think about it? We have these different bits of data that happen in the SDLC and in an incident lifecycle, and we can track this for all of the developers in our organization, but what do we do with that? What are some queries that you think would be important there? So I think you'll still be able to do some of the stuff that you do in the sort of traditional platform engineering way of collecting metadata. You'll be able to dig into DORA metrics, like deployment velocity or mean time to resolution, that kind of stuff. I think where the real magic is gonna be in the future is being able to combine some of those key points of developer experience with sort of the opposite end of the spectrum of how we measure developer effectiveness, which is more along the lines of the SPACE framework, where you're looking at satisfaction, you're looking at how people are feeling about particular tools.
I think there's a magical place where we can look at the data, we can look at how people are feeling, and start to identify problem spaces, things to focus on. I'm not 100% sure what that looks like, but I think the future is really, really interesting. Thanks. Aside from, I guess, being agnostic of analytics providers, what's the advantage of using this API over using, let's say, the Google Analytics JavaScript SDK directly? So, a couple of advantages. Out of the box, you can absolutely hook up the GA SDK to this framework and be able to collect events. At Spotify, we directly spoke to GA in the past, and so we're sort of tied to it in a way that's very difficult to overcome. There's lots of code with lots of events being fired, and we're trapped on GA, essentially. So the agnosticism of the API actually makes it so that you can very easily shift to a different provider without having to do a whole lot of work, hardly any work at all, really. That's probably the first one. The second one is just, like, if you're working on inner-source plugins or something like that, leveraging this API ensures that if you wanted to contribute them back to open source, you'd be able to do so very quickly without having to reinstrument things in your plugin. Any other questions? No, people just want coffee. We didn't need analytics to predict that, right? No. No.