Hey everyone, my name is Zohar, and I'm the CEO of Port. Today we're going to talk about how to whiteboard your software catalog taxonomy into your internal developer portal. If you're watching this webinar, you've probably tried out Backstage and have a bunch of questions, so today I'm going to shed light on what you should take into account while modeling a software catalog that needs to represent your way of work and your architecture. In this webinar we will cover an introduction to IDPs and the seven pillars that need to be included in your IDP. We're going to focus on the software catalog, which is one of its main building blocks: how you should choose the right model for it, how to bring data into the catalog from the various sources, and how you can use Backstage plugins to do that. And after you've done all this amazing work, what's next for the IDP? Part of the answer is self-service for developers.

It all started two years ago, when Backstage was released to the world by Spotify's engineering organization, by their platform team. They had Backstage as their own IDP, which gave the entire engineering organization a unified interface for consuming everything related to DevOps and the development lifecycle in a way they can understand. It does this both by giving them a visibility layer into the developed components, something they can comprehend and act upon, and by letting them consume resources and services off the shelf, so that engineering becomes self-sufficient.

There are several building blocks to an internal developer portal. We are not going to go through every one of them today, because we are focusing on the software catalog, but here is a quick overview of what should be included. The software catalog is definitely one of them.
The second one is the self-service part, where you allow developers to act upon the catalog and consume all kinds of self-service actions, like scaffolding a microservice, creating a development environment for five days, or adding an environment variable to a service. Software maturity is another important pillar, where you embed your organizational standards for development and make sure they are being met in a way that your engineering can follow. By using scorecards, you can certify your software in terms of production readiness, security, privacy, compliance, and so on. The fourth pillar is automation for different workflows. For example, you can auto-terminate a resource that was consumed through a self-service action, or use the API of the software catalog as part of the CI/CD jobs that deploy certain services, where you might want to fail or pass a build according to the data that resides in your software catalog.

The last pillar in this block is RBAC, role-based access control, where you decide what level of control each one of your users has: who can see specific data in the catalog and who can perform certain self-service actions. RBAC is a key driver for developer experience, because it removes all the noise from things that people don't need to see and don't need to know about. They just get what they need and are put on the golden path to get it, which reduces the cognitive load.

On top of all that, you have the ability to create insights and reports for your organization and for your own needs, so you can keep track of deployment success rate, DORA metrics, MTTR, and so on. And last, you have the UI, API, and ChatOps interface, which is the unified way to consume all these blocks in one single interface.
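As a rough illustration of the CI/CD idea above, failing or passing a build according to catalog data, here is a minimal Python sketch. It does not use any real Port or Backstage API: the `CATALOG` dictionary, the field names, the readiness levels, and the `should_pass_build` helper are all hypothetical stand-ins for data a pipeline job would fetch from the catalog.

```python
# Hypothetical stand-in for catalog data that a CI job would fetch over an API.
CATALOG = {
    "payments": {"owner": "team-billing", "production_readiness": "gold"},
    "legacy-reports": {"owner": None, "production_readiness": "bronze"},
}

# Hypothetical scorecard levels, ordered from lowest to highest.
LEVELS = ["bronze", "silver", "gold"]


def should_pass_build(service, minimum_level="silver"):
    """Return (passed, reasons) for a deployment gate based on catalog data."""
    entry = CATALOG.get(service)
    if entry is None:
        return False, [f"{service} is not registered in the catalog"]

    reasons = []
    if not entry["owner"]:
        reasons.append("no owner defined")
    if LEVELS.index(entry["production_readiness"]) < LEVELS.index(minimum_level):
        reasons.append(f"production readiness is below {minimum_level}")
    return (not reasons), reasons


if __name__ == "__main__":
    for name in ("payments", "legacy-reports", "unknown-service"):
        print(name, should_pass_build(name))
```

In a real pipeline the job would exit with a non-zero status when the gate returns `False`, which is what fails the build.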
So today we're going to focus on the software catalog, which is usually the first step into an IDP, and talk about how you should think about its structure and how you can bring your own structure into it.

Okay, so what does a software catalog essentially provide? It gives your engineering a simple way to get answers to very complex questions, questions like: where is Log4j currently deployed across my infrastructure? What is the current running production version of a given service? Who owns this microservice? Where can I find the API for it? And the list goes on. You can think of the software catalog as a visibility layer into the developed components, in the way your engineering understands them. It can be anything from microservices, versions, and environments to cloud resources and cloud accounts. The software catalog is a way to represent all of that, with all the metadata your engineering needs, in one place and in a way they can understand.

And if you think about it, your software catalog will look different from my software catalog and from your friend's software catalog, because every organization operates and makes specific decisions differently. All of us are developing software, but we all have different architectures for doing that, so every one of us is some kind of snowflake. When you think of a software catalog, you want to be able to bring your way of work into it. So how can you reflect different data models with an IDP? First, you need to be very opinionated and not compromise on your way of work, and you want to be able to reflect it directly in your IDP, for example with Backstage.
To do that, you really need to think about your data model. There are a lot of commonalities with other organizations, but your software catalog will still look different from others. So you really need to be opinionated and not think that you should change something about your chosen way of work.

So what should be baked into your taxonomy, into your software catalog? The most famous component, and probably the one you will start with, is the microservice. You might also want to include packages, Kubernetes clusters, pipelines, and a lot of other types of components. You might look at this list and find it familiar, or think about the commonalities, but I'm sure there are a couple of components that are not on this list, like your custom resources, and you should be able to represent them as well. All these entities and kinds have dependencies between them, because we are developing software, and software components depend on one another, especially in the DevOps era. You have services that are running on environments, and environments are using cloud resources that are hosted on different cloud accounts, and so on.

To be able to bring your data model into the software catalog, you need two main building blocks that Backstage provides. The first one is types of entities. You want to be able to define the types that you want as part of the software catalog, and you need some generic way to do that, so you need a way to define schemas for entities. This is the blue part. The orange part is the relations. You want to be able to create relations that reflect the dependencies between the different software components in the software catalog.
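The two building blocks, entity types and relations, can be sketched as a small data model. This is a minimal Python illustration under my own assumptions, not how Backstage implements its catalog: the class names, the `add_type` and `add_relation` helpers, and the sample types are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EntityType:
    """Schema for one type of catalog entity (the first building block)."""
    name: str
    required: frozenset


@dataclass(frozen=True)
class RelationType:
    """Allowed dependency between two entity types (the second building block)."""
    name: str
    source: str
    target: str


@dataclass
class DataModel:
    types: dict = field(default_factory=dict)
    relations: dict = field(default_factory=dict)

    def add_type(self, name, *required):
        self.types[name] = EntityType(name, frozenset(required))

    def add_relation(self, name, source, target):
        # A relation may only connect entity types that are already defined.
        for type_name in (source, target):
            if type_name not in self.types:
                raise KeyError(f"unknown entity type: {type_name}")
        self.relations[name] = RelationType(name, source, target)

    def missing_properties(self, type_name, properties):
        """Return the required properties that an entity does not provide."""
        return sorted(self.types[type_name].required - set(properties))


# Hypothetical model: services run on environments hosted in cloud accounts.
model = DataModel()
model.add_type("service", "owner")
model.add_type("environment", "region")
model.add_type("cloud_account", "provider")
model.add_relation("runs_on", "service", "environment")
model.add_relation("hosted_in", "environment", "cloud_account")
```

Keeping types and relations generic like this is what lets each organization add its own custom resources without changing the catalog itself.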
So these are the two main building blocks that you need in order to bring your own data model into the software catalog.

Okay, so let's build our first model together. Of course this is just an example, as you're probably going to have a slightly different structure for your way of work. The first thing we are going to answer is questions about services, like who owns this microservice, where can I find the API docs for it, and so on. So the first component we'll have is the microservice component. We are going to create this entity and apply all the different properties that I identified, like the owner, the on-call, the links to the different documentation, the readme, and so on. The second one is the system component. We want to be able to know which services are associated with a specific system within the overall architecture.

Then, on a separate note, I probably want to know which Kubernetes clusters I own and where they reside across different cloud environments, because I operate in a multi-cloud environment. So I'm going to create two other entities, a Kubernetes cluster and a cloud provider, because I want to be able to see all the clusters across the different cloud providers, since I use GCP and AWS, for example. You can see the arrow that indicates the dependency between the Kubernetes cluster entity and the cloud provider, and between service and system. This way I will be able to see the data with respect to the dependencies that represent my way of work.

Now I want to combine the two, because that gives me very strong answers. For example, I want to be able to know which services are running in production right now.
And when I say production: because I work with Kubernetes clusters that represent environments for me, I want to see which services are running across my environments, the Kubernetes clusters. So I'm going to create another type of component that is called a running service. A running service is connected to a service and to a Kubernetes cluster, because I want to get the runtime data about each service that is running across the different environments and to have a kind of matrix of all the services across all the environments. Then I will be able to see relevant data about the version and the CPU and memory limits for each service in each environment, and so on. So I added this type, called a running service, to the taxonomy of my software catalog.

I also want to know, for example, what the last deployments of a specific service to production were. For root cause analysis purposes, I want to see the thread of deployments for each service. So I'm going to create a deployment kind of entity that is associated with the running service, because each deployment has a logical connection and a reference to the running service that was deployed.

So I already have a way to see all the microservices and which services are part of each system. I can see each service and where it's currently running in terms of the Kubernetes cluster. I can also see all the Kubernetes clusters that I own across cloud providers, and I can see all the deployments that took place, each referencing the relevant service and environment. It's very powerful to have this kind of data model already.
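To make the running service and deployment entities more concrete, here is a minimal Python sketch of the questions this model can answer. Everything in it is hypothetical sample data, including the cluster names, versions, timestamps, and the `environment` property on clusters. It only illustrates following the relations and is not tied to any specific catalog product.

```python
# Hypothetical clusters; each one represents an environment on a cloud provider.
clusters = {
    "prod-eu": {"cloud_provider": "AWS", "environment": "production"},
    "staging": {"cloud_provider": "GCP", "environment": "staging"},
}

# A running service links a service to the cluster it runs on.
running_services = [
    {"service": "checkout", "cluster": "prod-eu", "version": "1.4.2"},
    {"service": "checkout", "cluster": "staging", "version": "1.5.0-rc1"},
    {"service": "search", "cluster": "staging", "version": "0.9.0"},
]

# Each deployment references the running service (service + cluster) it targeted.
deployments = [
    {"service": "checkout", "cluster": "prod-eu", "version": "1.4.1",
     "finished_at": "2023-01-10T09:00:00Z"},
    {"service": "checkout", "cluster": "prod-eu", "version": "1.4.2",
     "finished_at": "2023-01-12T15:30:00Z"},
]


def services_in(environment):
    """Which services are running in a given environment right now?"""
    return sorted({
        rs["service"] for rs in running_services
        if clusters[rs["cluster"]]["environment"] == environment
    })


def version_matrix():
    """Matrix of service -> cluster -> running version."""
    matrix = {}
    for rs in running_services:
        matrix.setdefault(rs["service"], {})[rs["cluster"]] = rs["version"]
    return matrix


def last_deployment(service, cluster):
    """Most recent deployment of a service to a cluster, or None."""
    matching = [d for d in deployments
                if d["service"] == service and d["cluster"] == cluster]
    # ISO 8601 timestamps in the same format sort correctly as strings.
    return max(matching, key=lambda d: d["finished_at"], default=None)
```

For example, `services_in("production")` answers the question of what is running in production, and `last_deployment("checkout", "prod-eu")` gives the starting point for a root cause analysis.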
But I want to add one last component, which is the package, because I want to see not only the service versions that are running in each environment, but also all the packages that were built as part of the deployment process for each service. If I have an incident or a vulnerability is found, I want to easily find where it is currently running, at the resolution of a service, a cluster, or a cloud provider. So this is how I chose to architect my basic model, and this is also how I recommend you think about an initial use case for your software catalog.

To accomplish this kind of data model with Backstage, you're provided with several ways to reflect these entities and relationships. These are called kinds, and five of them are provided: the component, the resource, the API, the system, and the domain. For this simple example, we use the component kind to represent the package and the service, the system kind for the system, of course, and the resource kind to represent Kubernetes clusters and cloud providers.

The building blocks provided by Backstage are good for showing metadata that is brought in the GitOps way. These are essentially manifest files that reside within your Git repository and are fetched into the Backstage software catalog. But to bring in data that is more relevant to the runtime, and to represent it in a nice way that connects to all your resources, you need to use plugins. To bring runtime information about your services, for example, Kubernetes holds the relevant data, so you might want to use the Kubernetes plugin. To bring in data about CI/CD, you might want to use the relevant plugins to reflect deployments.
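Going back to the package entity and the vulnerability use case, the lookup can be sketched as a walk from a package to the running services built with it, and from there to the cluster and cloud provider. This is a minimal Python sketch with hypothetical sample data (service names, package names, versions, and clusters are all made up for illustration).

```python
# Hypothetical: packages built into each (service, version) during deployment.
packages_by_build = {
    ("checkout", "1.4.2"): {"log4j-core": "2.14.1", "jackson-databind": "2.13.0"},
    ("search", "0.9.0"): {"log4j-core": "2.17.1"},
}

# Hypothetical running services and the cloud provider of each cluster.
running_services = [
    {"service": "checkout", "version": "1.4.2", "cluster": "prod-eu"},
    {"service": "search", "version": "0.9.0", "cluster": "staging"},
]
cluster_provider = {"prod-eu": "AWS", "staging": "GCP"}


def where_is_running(package, version):
    """Return (service, cluster, cloud provider) for every running service
    that was built with the given package version."""
    hits = []
    for rs in running_services:
        built_with = packages_by_build.get((rs["service"], rs["version"]), {})
        if built_with.get(package) == version:
            hits.append((rs["service"], rs["cluster"],
                         cluster_provider[rs["cluster"]]))
    return hits
```

The point of the sketch is only that once packages are related to services, and services to clusters and providers, an incident question becomes a simple traversal of the catalog.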
Of course, to be able to create this kind of architecture and reflect it just the way you want, it might require some work to adjust everything to fit together. But it's achievable with the model that is provided. This is called the C4 model, and Backstage provides five kinds for reflecting the metadata about your software.

The first one is the component. A component is essentially any kind of piece of software, from services to packages to backend services to data pipelines, and so on. It is tracked in the source control that you maintain as a service owner.

The second one is the API. The API is an important part of the catalog, and it allows you to represent the connections and boundaries between different components and the way they consume one another. You want to reflect the API definition, whether it's a protobuf or GraphQL data schema, and define the code interfaces between components. These need to be in machine-readable formats, so that further tooling and analysis can be built on top of them.

The third one is the resource. Resources are essentially all kinds of infrastructure pieces, from S3 buckets to Pub/Sub to databases, anything related to your resources and specifically cloud resources. By modeling them, you get a better way to visualize resources and to create tooling around them.

The next one is the system. As you have a lot of software components, you want to be able to create an abstraction that bundles a couple of resources and components under one umbrella, which is presented as a system.
You want to have a logical way to combine everything together and give it a name, so you can encapsulate a couple of resources and components under one umbrella.

The last one is the domain. The domain is also a way to encapsulate a couple of related entities. It's very useful for grouping a couple of systems that share terminology, usually around a business purpose. You might have a couple of business domains within your company, and you want to reflect the software with respect to the business structure.

So these are the kinds that are provided by the C4 model and by Spotify's Backstage. If you want to add more types of components to the software catalog, ones that are not covered by these kinds, you might want to extend the model for more custom use cases. You may have to write some code for that, and this is absolutely fine. You can do it, but it will not be provided by Backstage's out-of-the-box building blocks, and I will not cover it in this session.

So let's talk a little bit about ways to ingest data into the catalog using Backstage. For most components you will only be able to bring in the data using GitOps. For packages, services, clusters, and so on, you will need to maintain files in Git as part of your repository, and these will be automatically reflected in the Backstage software catalog. So the kinds provided by the C4 model are mostly good for reflecting metadata about your software: services, clusters, packages, things like that. For runtime use cases like running services and live deployments, things that are more live and ephemeral, you might want to use plugins, and think about how you want to connect them and how you want to reflect this kind of data in the software catalog.
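To give a feel for what the GitOps metadata looks like, here is a hedged sketch of two entity descriptors. In Backstage these normally live as YAML files in the repository (commonly named `catalog-info.yaml`); they are written as Python dictionaries here only to allow a small validation helper. The field names follow the Backstage descriptor format as I understand it, while the entity names, owners, the `kubernetes-cluster` type value, and the `missing_fields` helper are hypothetical.

```python
# Required spec fields per kind, as I understand the Backstage descriptor format.
REQUIRED_SPEC_FIELDS = {
    "Component": {"type", "lifecycle", "owner"},
    "API": {"type", "lifecycle", "owner", "definition"},
    "Resource": {"type", "owner"},
    "System": {"owner"},
    "Domain": {"owner"},
}


def missing_fields(entity):
    """Return a list of missing or invalid fields for one of the five kinds."""
    kind = entity.get("kind")
    if kind not in REQUIRED_SPEC_FIELDS:
        raise ValueError(f"not one of the five kinds discussed here: {kind}")

    problems = []
    if entity.get("apiVersion") != "backstage.io/v1alpha1":
        problems.append("apiVersion")
    if not entity.get("metadata", {}).get("name"):
        problems.append("metadata.name")
    spec = entity.get("spec", {})
    problems += [f"spec.{name}"
                 for name in sorted(REQUIRED_SPEC_FIELDS[kind] - set(spec))]
    return problems


# A service modeled as a Component that belongs to a System (hypothetical names).
service = {
    "apiVersion": "backstage.io/v1alpha1",
    "kind": "Component",
    "metadata": {"name": "checkout"},
    "spec": {"type": "service", "lifecycle": "production",
             "owner": "team-payments", "system": "storefront"},
}

# A Kubernetes cluster modeled as a Resource; the owner was left out on purpose.
cluster = {
    "apiVersion": "backstage.io/v1alpha1",
    "kind": "Resource",
    "metadata": {"name": "prod-eu"},
    "spec": {"type": "kubernetes-cluster"},
}
```

Here `missing_fields(service)` returns an empty list, while `missing_fields(cluster)` reports `spec.owner`, which is the kind of check a repository could run before a descriptor is merged.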
So plugins are better for live data, and the other types of resources are better for metadata and the GitOps way.

Let's talk a little bit about the segmentation of the different plugins that are provided by Backstage. You have the cloud, the CI/CD, the GitHub, the Kubernetes, which is a standalone one, and the SSO. With the cloud plugins, you can bring in data about CloudFormation, Lambdas, pipelines, and things like that across the different clouds used by your organization. The second type is the plugins for CI/CD: Jenkins, Tekton pipelines, Travis CI, and so on. By the way, these kinds of plugins are for data ingestion. There are other plugins for visualizations and views. The Backstage plugin world has a couple of segments, or types of plugins, but here I'm talking about the data ingestion plugins.

So you have the CI/CD plugins, and you have GitHub, of course, if you want to reflect data about actions, pull requests, and all kinds of out-of-the-box GitHub information. You have Kubernetes, which is an opinionated plugin that gives you a way to show clusters, live logs about clusters, and basic Kubernetes data. It gives you one fixed form in which the Kubernetes cluster is reflected in your Backstage instance. You can use that, and you can also reflect Argo CD flows as part of it. And then you have the SSO, which is not so related to the data catalog, but it's still important to emphasize that it is also provided by a plugin. You can use Okta or integrate with the LDAP protocol to enable single sign-on for your Backstage instance.

Up until now we spoke about the software catalog. The next phase, after you've built the software catalog, is the self-service part.
This is usually the natural way to proceed once you've done that. You want to allow all kinds of actions on the catalog, and this is the next phase of your IDP. What we recommend is proceeding with your Backstage installation.

So, to summarize everything that we went through: we talked a little bit about what an IDP is. We took a deep dive into the software catalog, how to model it, and which building blocks are provided and should be provided as part of your IDP solution, specifically with Backstage and how you can use its building blocks to architect the model. We talked a little bit about the C4 model. Another thing we talked about is how RBAC can be used to adjust views for your own personas. And think about how you can take the IDP into its next phase, the self-service part, where you allow developers to act on their own and become self-sufficient engineers.

If you have any thoughts or comments, I will be very happy to talk about them. We are really excited about the field. We think Backstage is great, with a great community and a great open source project. So really feel free to reach out to me by email, Zohar at getport.io, and feel free to try it out. It's open for self-service, and you can try it at getport.io. It's open and free to use. Thank you so much. I hope you enjoyed this conversation, and please reach out if you have any questions.