So how many in here have actually used the Tech Insights plugin in open source Backstage? How many have deployed it? I see two hands. That's nice. Maybe it's a few more after this talk.
So we're going to talk about Tech Insights a little bit. We'll show why we built it for Roadie, first of all, because we wanted to roll it out for our customers, what components are now available in Roadie, and how you can implement those yourself in open source Backstage as well. We will go through what we currently have in place and why we built it. Then I'll jump in and talk about what is in open source Backstage, how it was architected initially, and what the ideas behind it are. And then we'll go through what you can do to actually use that to your advantage in the open source project or in your own Backstage deployment. So maybe more than two people after this talk.
Can you hear me? Thank you, Jussi. So as Jussi mentioned, I will first give you a brief overview of why we decided to develop Tech Insights, followed by examples of how we use it. So let's start by telling you how it all started. When we first started using Backstage, everything was great.
We didn't have a lot of services, of course. It was maybe around 10 services that we needed to check, so we were able to manually track progress in all of them, do all sorts of tasks, and see which of those services were successful and which were not. However, over time the number of our services increased. We started having more than 100 services, then more than a thousand. We didn't have a million services, but you see my point. So we figured out that manually checking things wasn't a piece of cake anymore, and we were looking for an automated way to keep us happy, because we figured out that we weren't as efficient as before. We had a couple of main goals in mind, so I will guide you through them.
Firstly, we wanted to measure the quality of software. Now, quality is a common term, but the understanding of it may not be the same for all software engineers. So by introducing scorecards we were able to reflect on those checks that matter to us the most. For example, here you can see several checks, such as the presence of Snyk, the number of critical severities, or whether none of them are found. Basically, checks themselves are a single unit of computation, and they produce a result based on that value. The scorecard result is then the accumulation of all of those results. Whether you decide to use it on an entity level, as shown here, or on a catalog level, it is clear that with this overview it's easier for engineers to focus only on those services that are not meeting the requirements set in the scorecard, so that they can improve them accordingly.
Okay, secondly, we wanted to visualize what matters the most. We strongly believe in the power of visual components to provide better visualization.
We developed a graph card, which you can see here. This card gives you information about a fact value over a time range. Here it's a one-month period, but of course it can be anything else that matters to you. Apart from that, we have also developed a big number card. The main difference is that this card gives you the current values for the facts. Both of these are quite handy, and we use them because their biggest power is that you can combine multiple facts from multiple fact retrievers and display all of them in these cards.
And lastly, but of course not least important: we know that in organizations different teams work on different parts of the platform. So by developing a team-level report we have emphasized the accountability for services within teams, so that they know which services they own and what they need to do to increase the quality of those services and meet the requirements set in the scorecards. However, we know that we don't live in a world where all developers can jump on a task as soon as it pops up. So we are currently working on a feature that will provide real-time feedback alerts and subtly nudge developers, and teams in general, so they know what they need to do in their services for the requirements to be met. These nudges can be on Slack or GitHub or whatever tool you use, but as I said, they nudge the team in a subtle way so they know what they need to do.
I have mentioned several terms in my part of the presentation: checks, facts, and fact retrievers. I'm sure you wonder how they all fit into place. So now I will let Jussi take over, and he will tell you more about the architecture behind Tech Insights, and hopefully you will have a better idea of how it all fits into place.
Cool.
Thank you. So if we think about Tech Insights in general, it is a plugin that needs a lot of work, because the data sources that you use to actually retrieve the insights data are usually something custom. You can think about data sources being GitHub, or in our case Snyk is a very good data source for us. But then you have sources like internal CI/CD solutions that you want to pull data in from, and those are usually very specific to the implementation. Over the next few slides I'll go through how you can actually do that with your own Backstage deployment and what the missing pieces are that you need to implement to make that available for you.
When I think about Tech Insights, I usually think about three layers. We have the data layer, which contains the storage of the data as well as the retrieval, or how we get the data into the storage. Then we have calculation, which I usually divide into two different bits: checks, and then manipulation and other aggregation. And finally visualization. We saw a few visual elements here already that we have developed for Roadie. There is a component in open source Backstage as well that you can use to visualize your scorecards. We'll go through these one by one now.
You will see some code as well.
First let's think about data. If you think of Backstage plugins, a lot of the implementations that we have seen usually fetch the data from some third-party source and then display it. So you'd have the current data available and that would be pretty much it: you fetch it at runtime and show it immediately. There might be some cache that stores the data in the database, but from what we've seen it's usually not that long-lived. Tech Insights flips that around, because we want Tech Insights to also be a metrics and analytics solution. We want to have the possibility to do reporting and analysis on the data, so we store the data for longer periods of time in the Tech Insights database within Backstage.
How it's currently done: we have two different tables in there. The first table is called the schema. The schema defines the shape of the data that you want to store in Tech Insights. If you take a look at the JSON blob in there, or actually it's JavaScript and TypeScript code, the schema itself has multiple fact items within it. So you have a fact schema that encompasses multiple fact items, and you can consider that all of these fact items come from the same data source, whether it's Snyk or GitHub or your internal CI/CD solution or something else. All of these fact items within the schema have a type. The types currently available in open source Backstage are a few numerical types, so integers and floats, and then strings, sets, and timestamps. In some cases you might want to store complex serializable objects like JSON blobs as facts as well. When we have this fact data available, we can work with it.
The second table that we have is then the fact itself. This is a pure one-to-one mapping to the fact schema that we saw earlier. You can see that on the last three lines.
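The slide is not part of this transcript, so here is a rough sketch of what a fact schema and a matching fact row could look like. The type names, the row fields (`id`, `entity`, `timestamp`), the `snykFactRetriever` id, and the fact item names are assumptions for illustration and are not taken from the plugin's actual types.

```typescript
// Illustrative types only; names are assumptions, not the plugin's exact API.
type FactType = 'integer' | 'float' | 'string' | 'set' | 'datetime' | 'object';

// A fact schema: several fact items from one data source, each with a type.
type FactSchema = Record<string, { type: FactType; description: string }>;

// A stored fact row: values keyed one-to-one by the schema's fact item names.
interface FactRow {
  id: string; // which retriever produced the row
  entity: string; // entity reference the facts belong to
  timestamp: string; // ISO 8601 timestamp of the retrieval
  facts: Record<string, unknown>;
}

const snykSchema: FactSchema = {
  severeVulnerabilities: { type: 'integer', description: 'Open critical or high issues' },
  affectedPackages: { type: 'set', description: 'Packages with open issues' },
  lastScan: { type: 'datetime', description: 'When the project was last scanned' },
};

const sampleRow: FactRow = {
  id: 'snykFactRetriever',
  entity: 'component:default/sample-service',
  timestamp: '2022-10-24T12:00:00Z',
  facts: {
    severeVulnerabilities: 3,
    affectedPackages: ['express', 'lodash'],
    lastScan: '2022-10-24T11:30:00Z',
  },
};

// True when a single value is acceptable for the declared fact type.
function conforms(type: FactType, value: unknown): boolean {
  switch (type) {
    case 'integer':
      return typeof value === 'number' && isFinite(value) && Math.floor(value) === value;
    case 'float':
      return typeof value === 'number' && isFinite(value);
    case 'string':
      return typeof value === 'string';
    case 'set':
      return Array.isArray(value);
    case 'datetime':
      return typeof value === 'string' && !isNaN(Date.parse(value));
    case 'object':
      return typeof value === 'object' && value !== null && !Array.isArray(value);
  }
}

// True when the row has exactly the schema's fact items, each with a valid value.
function matchesSchema(schema: FactSchema, row: FactRow): boolean {
  const schemaKeys = Object.keys(schema);
  const factKeys = Object.keys(row.facts);
  return (
    schemaKeys.length === factKeys.length &&
    schemaKeys.every(key => key in row.facts && conforms(schema[key].type, row.facts[key]))
  );
}
```

The `matchesSchema` helper is only there to make the one-to-one relationship between schema and row explicit; a real deployment would rely on whatever validation the storage layer provides.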
We have the same shape, but now we have the actual values in there as well. There are two other interesting bits in here. First, the timestamp: all of the data that we store in the Tech Insights database is time series data, and that way we can run analytics on it or graph it out or whatever we want to do with it. Second, all of this data has a hard reference to an entity. In this case the reference is a component called sample service, but the reference can be anything else you can think of. It can be a resource: you may want to do analytics on the AWS resources that you have mapped in Backstage. Or maybe you want to do analytics on your employees: you have a user entity and you want to see how many lines of code they have committed over the last two weeks, and maybe do raises based on that. No, don't do that. It's possible, but don't do that.
The top field there, the ID, is the last piece of terminology that we'll introduce here: the fact retriever. In this case we are calling it the Snyk fact retriever. A fact retriever is a logical concept that encapsulates both the schema, which we see on line three, and the logic that we need to actually get the data from the third party and map it into the shape that we have defined in the schema. The shape of these fact retrievers in Backstage is usually like this: you define a schema, create a function for it, and then register it with some kind of schedule, so it can run on a loop, contact third parties, fetch data, and store it in the Backstage database.
The whole data model looks a little bit like this. We have the fact retrievers, the schemas for them, and then a big database that stores all the facts. If you squint hard you might see some similarities to a star schema there, but we are in a very denormalized world.
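The retriever shape described above (a schema, a function, and a schedule) could be sketched roughly as follows. This is a self-contained approximation: the `FactRetriever` and `Registration` interfaces, the `cadence` field, and the `fetchSnykIssues` stand-in are hypothetical names chosen for this example, not the plugin's exact API, and the third-party call is replaced by canned data.

```typescript
// One retrieved fact row, tied to an entity and a point in time.
interface RetrievedFact {
  entity: string;
  timestamp: string;
  facts: Record<string, number | string>;
}

// Hypothetical retriever shape: an id, the schema, and the retrieval logic.
interface FactRetriever {
  id: string;
  schema: Record<string, { type: string; description: string }>;
  handler: () => Promise<RetrievedFact[]>;
}

// Hypothetical registration: the retriever plus a schedule (cron syntax assumed).
interface Registration {
  factRetriever: FactRetriever;
  cadence: string;
}

interface SnykIssue {
  service: string;
  severity: 'low' | 'medium' | 'high' | 'critical';
}

// Stand-in for the third-party call; a real retriever would do an HTTP request here.
async function fetchSnykIssues(): Promise<SnykIssue[]> {
  return [
    { service: 'sample-service', severity: 'critical' },
    { service: 'sample-service', severity: 'low' },
  ];
}

// Pure mapping step: third-party data in, rows in the schema's shape out.
function toFacts(issues: SnykIssue[], now: Date): RetrievedFact[] {
  const counts: Record<string, number> = {};
  for (const issue of issues) {
    counts[issue.service] = counts[issue.service] || 0;
    if (issue.severity === 'critical' || issue.severity === 'high') {
      counts[issue.service] += 1;
    }
  }
  return Object.keys(counts).map(service => ({
    entity: `component:default/${service}`,
    timestamp: now.toISOString(),
    facts: { severeVulnerabilities: counts[service] },
  }));
}

const snykFactRetriever: FactRetriever = {
  id: 'snykFactRetriever',
  schema: {
    severeVulnerabilities: { type: 'integer', description: 'Open critical or high issues' },
  },
  handler: async () => toFacts(await fetchSnykIssues(), new Date()),
};

// Run every six hours; the scheduler itself is outside the scope of this sketch.
const registration: Registration = {
  factRetriever: snykFactRetriever,
  cadence: '0 */6 * * *',
};
```

Keeping the mapping in a pure function such as `toFacts` makes the retriever easy to test without contacting the third party.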
You need to squint really hard to actually see the star schema, though. The database implementation in Backstage currently uses whatever you want to use: it is SQLite or Postgres, but it is open to extension. So if you want to dump the data into Cassandra or DynamoDB or anywhere else, you can bring your own, because the API surface of the interfaces that make these database calls is fairly small. There are five or six methods that you actually need to implement, and it's mostly just dumping JSON somewhere or retrieving it. Cool, so that is the data layer.
Repeating the slide, next we can talk about calculations. Now that we have the data available, we need to do something with it. There are two categories of calculations that I've identified. The first one is something that was in the MVP that we wanted to make for open source Backstage, so we are purely talking about scorecards, and scorecards are built on checks. Checks are simple: you have a target value and you have a fact value. Here the fact is called severe vulnerabilities and the target is one, and then you have an operator that compares those two. The default implementation in there is json-rules-engine, and it always produces a true or false value. This is by design.
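A check of this kind can be sketched in a few lines. The code below is a hand-rolled stand-in and does not use the rules engine that the plugin ships with; the `Check` shape, the operator names, and the choice of `lessThan` for the severe vulnerabilities example are assumptions made for illustration.

```typescript
type Operator = 'lessThan' | 'lessThanInclusive' | 'equal' | 'greaterThanInclusive' | 'greaterThan';

// A check compares one fact value against a target using an operator.
interface Check {
  id: string;
  fact: string;
  operator: Operator;
  target: number;
}

// Always returns true or false; a missing fact counts as a failed check.
function runCheck(check: Check, facts: Record<string, number>): boolean {
  const value = facts[check.fact];
  if (typeof value !== 'number') {
    return false;
  }
  switch (check.operator) {
    case 'lessThan':
      return value < check.target;
    case 'lessThanInclusive':
      return value <= check.target;
    case 'equal':
      return value === check.target;
    case 'greaterThanInclusive':
      return value >= check.target;
    case 'greaterThan':
      return value > check.target;
  }
}

// Evaluate several checks against one entity's facts, keyed by check id.
function runChecks(checks: Check[], facts: Record<string, number>): Record<string, boolean> {
  const results: Record<string, boolean> = {};
  for (const check of checks) {
    results[check.id] = runCheck(check, facts);
  }
  return results;
}

// The example from the talk: fact "severe vulnerabilities", target 1.
const noSevereVulnerabilities: Check = {
  id: 'noSevereVulnerabilities',
  fact: 'severeVulnerabilities',
  operator: 'lessThan',
  target: 1,
};
```

Because every check collapses to a boolean, the results can be counted and combined without knowing anything about the underlying facts.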
We want to produce true or false values because that way we can roll those up into scorecards and see that, okay, this scorecard has five checks, four of them succeed, and it gets a grade of 80 percent, or four out of five. We'll talk a little more about how to create checks later, but first let's talk about the other calculations.
The raw fact data is usually retrieved from a third party: you contact the third party, pull the data, map it, and dump it into the database. But you can also create fact retrievers that use the fact data already available in the Tech Insights database. You can query ranges of it and then run calculations and aggregations on it: run arithmetic, calculate averages over a time window, min and max values, the most common value, percentiles, and so on. Then you store the result back into the same Tech Insights database as a second layer of fact data. That way, when you want to see what the average over a one-week period was today, you can see that, but you can also go four weeks back and see what the actual value was at that point. You don't need to do additional calculations. You can do additional calculations, though, because these calculations are usually fairly simple and you can implement them either at the router level or possibly even directly in the frontend when you get the data.
So those are the two flavors of calculations. Neither one of these is actually necessary.
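As a rough illustration of such a second-layer calculation, the following computes an average over a time window from stored fact rows. `StoredFact` and `averageOverWindow` are hypothetical names, and the in-memory array stands in for a range query against the Tech Insights database.

```typescript
// One numeric fact value for an entity at a point in time.
interface StoredFact {
  entity: string;
  timestamp: string; // ISO 8601
  value: number;
}

const DAY_IN_MS = 24 * 60 * 60 * 1000;

// Average of an entity's values in the window (windowEnd - windowDays, windowEnd].
// Returns undefined when the window contains no rows for the entity.
function averageOverWindow(
  rows: StoredFact[],
  entity: string,
  windowEnd: Date,
  windowDays: number,
): number | undefined {
  const end = windowEnd.getTime();
  const start = end - windowDays * DAY_IN_MS;
  const values = rows
    .filter(row => row.entity === entity)
    .filter(row => {
      const time = Date.parse(row.timestamp);
      return time > start && time <= end;
    })
    .map(row => row.value);
  if (values.length === 0) {
    return undefined;
  }
  return values.reduce((sum, value) => sum + value, 0) / values.length;
}

// Sample history of a severe vulnerability count for one service.
const history: StoredFact[] = [
  { entity: 'component:default/sample-service', timestamp: '2022-09-20T12:00:00Z', value: 20 },
  { entity: 'component:default/sample-service', timestamp: '2022-10-01T12:00:00Z', value: 8 },
  { entity: 'component:default/sample-service', timestamp: '2022-10-05T12:00:00Z', value: 6 },
  { entity: 'component:default/sample-service', timestamp: '2022-10-07T12:00:00Z', value: 4 },
];
```

Storing the computed average back as its own fact row is what makes it possible to look up the value of the aggregate at an earlier date without recomputing it.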
Both are available, though. If you want to use checks, there are endpoints where you can pass in a check identifier and a fact identifier, and it will give you a result. Or if you want to do aggregations or other calculations with the fact data, you can create a fact retriever and dump the results back into the database. But you don't have to, because you can use the fact data directly from the database and visualize it.
So the last layer: visualization. We saw a few of the elements that we have built for Roadie, and there are a few others that we have ideated. The implementations currently available on the open source Backstage side are limited to the scorecard. We saw the scorecard already, and as a concept it is fairly simple. It is a full end-to-end solution that you can use today. So if you want to see how secure your service is, you can create checks that check, okay, I have fewer than five vulnerabilities, or similar items like that, create a scorecard out of them, tie it to an entity, and show it on the catalog page of your entities. Teams will then be able to see whether they are getting a good grade or not.
The next item is using the fact data directly. Endpoints are available for that. We will see a talk about visualization in two hours, I think, which will be very interesting, because I believe that the Tech Insights database and the facts stored in there will be a very good source of data for the visualizations that we will be seeing, and I think a lot of good will come out of combining these two talks. So yeah, you can do ranges, you can do charts, and whatnot.
Then aggregate visualization. It's the same thing, using the fact data directly, but with different kinds of items to be displayed.
So you can do pie charts, you can do average numbers or the big number card that we saw earlier. You can even create infographics based on the fact data that you have calculated in Tech Insights. All three of these are very tied to the catalog. You'd usually display ones that are tied to an entity, you'd see them on the catalog page, and you would maybe create a roll-up. We saw a roll-up of scorecards earlier, a team view of scorecards. But mostly these are very tied to the catalog itself.
The other bits, which I simply called "others": you can actually do more with the Tech Insights data that we have in the database. If we think about the original demo video that is on backstage.io/demo, in there we saw a migration from, I can't remember what it was, let's call it Python 2 to Python 3, and tracking how teams were faring: what the score of the team was and how that migration was going, or the same organization-wide. You can do that with the Tech Insights database using scorecards, because what an initiative would be in this case is just a scorecard with a start date and an end date.
The other bit that we saw in the demo video is displaying the fact data directly in a pure format. In the demo video we saw a table view that displayed, for some service, what its dependencies were and what the versions of those dependencies were. So you can think: okay, I want to map all of the dependencies of my Node.js microservice. What you do is retrieve the fact data for the Dockerfile that is used to deploy it, for example, and retrieve the fact data from the package.json file to see what the Express version is. Then you can query the fact data and display it in a tabular format for the whole organization or team, and you have an overview to see who is an early adopter, who is lagging behind, and who needs to upgrade their versions.
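The dependency overview described above could be derived from stored facts along these lines. This is a minimal sketch under stated assumptions: the `DependencyFact` shape, the `expressVersion` fact name, and the rule that a service is "lagging" when its major version is below a target are all invented for the example.

```typescript
// One row of dependency fact data, as it might be read from package.json.
interface DependencyFact {
  entity: string;
  owner: string;
  expressVersion: string;
}

type UpgradeStatus = 'up to date' | 'lagging' | 'unknown';

interface OverviewRow extends DependencyFact {
  status: UpgradeStatus;
}

// Read the major version, ignoring range prefixes such as ^ or ~.
function majorVersion(version: string): number {
  return parseInt(version.replace(/^[^0-9]*/, ''), 10);
}

// Classify a version against the major version the organization is moving to.
function upgradeStatus(version: string, targetMajor: number): UpgradeStatus {
  const major = majorVersion(version);
  if (isNaN(major)) {
    return 'unknown';
  }
  return major >= targetMajor ? 'up to date' : 'lagging';
}

// Build the tabular overview: one row per service with its upgrade status.
function dependencyOverview(facts: DependencyFact[], targetMajor: number): OverviewRow[] {
  return facts.map(fact => ({
    entity: fact.entity,
    owner: fact.owner,
    expressVersion: fact.expressVersion,
    status: upgradeStatus(fact.expressVersion, targetMajor),
  }));
}

const overview = dependencyOverview(
  [
    { entity: 'component:default/sample-service', owner: 'team-a', expressVersion: '^4.18.2' },
    { entity: 'component:default/legacy-service', owner: 'team-b', expressVersion: '3.21.2' },
  ],
  4,
);
```

The resulting rows can be rendered as a table and grouped by `owner` for a team-level view.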
So all of this is possible now. But how do we actually do it? There are three steps that you need to implement. I said earlier that it is very code-heavy to get Tech Insights up and running. A lot of Backstage is, because you do need to write your own scaffolding code to set the dependencies up and wire things together. Maybe the backend system will help with that. I hope it will; it looks very good already.
But back to Tech Insights. We want to create three different items. The first one is the fact retriever. You find out what your third-party source for the data is, whether it is GitHub or Snyk or an internal CI/CD tool. Then you identify the shape of the data, so you create a fact schema for it. And finally you create the actual logic, which is the simplest bit: make a call to the third party, modify the data, and dump it into the Tech Insights database.
Now that we have the data inside, we can create checks. The functionality here is very tied to scorecards. The current implementation of Tech Insights enables you to create scorecards in your Backstage deployment, and to create these scorecards you need to create checks. The default implementation is currently json-rules-engine, and it was chosen because it is quite powerful. It gives you easy tools to get started, so you can create easy checks like "the number two is less than my fact", which is simple to understand, but you can also do boolean logic in there: you can do and/or combinations in json-rules-engine. And the most interesting bit, I think, about json-rules-engine is that it has the ability to compare one fact against another fact. So you can imagine an implementation where your DevSecOps team maintains a Google Sheet where they store values like: okay, Log4j, the Log4j version needs to be higher than whatever the vulnerable version was. You can use that as your fact source: call the Google Sheets API
and store that as a fact. Then you can create a check that compares that DevSecOps-provided fact against the actual dependency version in your entity. That way you have completely dynamic checks. You don't need to modify code anymore: you create the check once and then you only need to modify the Google Sheet, or maybe another Git repository with a file of values, and so on.
Finally, when we have these checks available, we can create scorecards based on them. There is a plugin to create scorecards. It displays the check results: it makes automatic calls to the check endpoints in the backend and displays the grade of the scorecard that we want to see.
Cool. I think with those you should be able to get started. So, three items to think about. Creating fact retrievers and creating checks: fact retrievers are TypeScript code, checks are a JSON structure, more or less, so you can create those outside the code and maybe store them in the database, because that way they might be easier to handle and you don't need to modify code. And finally, when you have those two available, you can start creating your scorecards. If you have checks for security, or maybe static analysis checks that your code quality needs to be good enough, you can use those as scorecards and show how well your teams are faring or how well the entities are conforming to the standards that you want to set.
Yeah, I believe that's all we have prepared for today. However, keep in mind that this project is still in its early stage, so contributions are very welcome. Here are a few ideas for how you can all get involved and contribute.
We hope that we have sparked your interest in the Tech Insights plugin and that, if you haven't so far, you will try it out now. Unfortunately, we don't have much more time, but I think we'll have time for a couple of questions, so make sure to raise your hands. And if there is still interest, or you need to know something, make sure to stop by our booth. We'll be here the whole week, both Jussi and I will be there most of the time, and we will be happy to answer all of your questions. Thank you for your attention.
Thank you.
Thank you. Are there plans for it to be available hosted on Roadie?
Tech Insights? Yes, that is something we are currently building on the Roadie side. We already have automated tools in the UI to create these fact retrievers and checks.
More questions?
Hey, how would you compare Tech Insights to something like Datadog, which also has the concept of monitors and alerts based on Kubernetes metrics? I know there's also a Datadog plugin for Backstage. Could you also use something like that for service health?
Let me see if I understood the question correctly. So how it compares to Kubernetes metrics or Datadog or something similar, right?
Yeah.
Tech Insights isn't quite that. If we think about those metrics in general, the cadence for those is usually much shorter, so you are talking about minutes or so. Now think about Tech Insights.
There it's usually about creating checks or retrieving facts that might take days to change. Visualizing those is something that you can do in Kubernetes as well, for example, but actually having the historical data, what it was ten months ago, is useful, because you can see that over time our team took a scorecard from 20% to 93%, or that their severe vulnerabilities dropped from 80 to 3 over the last two-month period. So it's a similar solution, but mostly the data source would be different. And because you have data from different third-party sources, you can have Datadog and Kubernetes metrics in the same database, merge those together, and maybe compare them or create aggregations based on them. I hope that answered the question.
Yeah, good. And we thank you for your session.
Thank you.