Hi, everyone, and welcome to an on-demand session for the CNCF. My name is Moor, and I'm a lead architect at Port. Today we'll be talking about what we mean when we say "data model", and we're going to learn how to think about what needs to be inside your developer portal. It should be a very cool session with a lot of very relevant information, so let's get started.

OK, so what are we going to talk about today? First, we're going to go through a short introduction to the world of platform engineering. Then we're going to talk about the core pillars of a developer portal, what a data model is and why you need one, how to define a data model MVP, and what a blueprint is, which is a Port component. And we're going to go over some practical examples to create a core model and learn how to extend that model once we get started.

OK, so first of all, an introduction to platform engineering, and what gave rise to developer portals and internal developer platforms. Basically, as time went on and the DevOps boom passed, we reached a point where developers have so many tools and need to keep so many things in mind. They need to keep track of Kubernetes, Argo CD, their cloud provider, their Git repository, and all of the different tooling, and it just became a very cluttered mess. It became very hard to find the correct tooling and the correct pieces of information, and very hard for developers and the different personas in the R&D organization to keep track of everything.
And this gave rise to the movement of platform engineering, whose goal is to make an organized platform that brings all that information together, exposes it in a convenient way that makes sense, gives you those tailored views, and also gives personas more independence in what they do: giving them self-service, an organized catalog, and scorecards to understand exactly how they're doing with the services they're responsible for, with their development process, and so on. This is a process that was pushed by companies such as Spotify and Lyft, and it has now become a very general movement, also heralded by Backstage, which is a CNCF project. And now we're going to talk about how to build your internal developer portal using a data model.

So first of all, let's go over the core pillars of a developer portal and what makes a good developer portal. First, we have the software catalog. The software catalog is where all of the information flows into. This is the visibility layer that gives us that single pane of glass into everything that's happening in our organization: the services, the deployments, the CI/CD jobs, the resources we have from our cloud provider, and how they all connect to one another. This is the one tab developers need in their browser to get the answer to every question they might have, such as: who is the current on-call? What is the documentation or README for a given service? And so on.

Then we have the self-service layer. We want to give developers more independence. Nowadays it is very common for developers to require certain resources from their DevOps team. They might require a new cloud resource or some permissions, they might need a temporary development environment, and so on. And usually that revolves around finding their DevOps contact, sending a ticket, or sending a Slack message.
And usually what happens in that case is that the DevOps engineer will simply receive the request, run their own pre-made script that already performs that action, and tell the developer: okay, it's ready, you can go ahead. We want to streamline that process and give developers the option to simply consume those pre-made scripts and actions through a self-service layer that contains all the necessary role-based access control and guardrails, so developers gain more independence but still have a safe way to consume those self-service actions.

After that, we have workflow automation. As time goes on and your developer portal becomes more robust and caters to more and more use cases, the information it holds becomes even more precious and valuable, which means that not only personas such as developers, DevOps, and platform engineers can make use of it; machines can make use of that information as well. So scripts, CI/CD, deployment processes, and so on can all make use of the information in the software catalog to make decisions, take control of the different processes, and maybe put stops where they are needed. That is the workflow automation layer.

And we have scorecards. Scorecards are a way for organizations to push forward their engineering excellence and to keep track of how they are doing against standards within the organization, and also standards of the industry.
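To make the scorecard idea concrete, here is a minimal sketch of how scorecard rules could be evaluated against a service's catalog data. The rule and entity shapes here are invented for illustration; they are not the schema of Port, Backstage, or any specific portal product.

```python
# Hypothetical scorecard evaluation: each rule checks one property of a
# catalog entity against a threshold. The shapes here are invented for
# this sketch, not taken from a real portal API.

def evaluate_scorecard(entity: dict, rules: list) -> dict:
    """Return a pass/fail result for every rule, keyed by rule name."""
    results = {}
    for rule in rules:
        value = entity["properties"].get(rule["property"])
        if rule["operator"] == "<":
            passed = value is not None and value < rule["threshold"]
        elif rule["operator"] == "==":
            passed = value == rule["threshold"]
        else:
            raise ValueError(f"unsupported operator: {rule['operator']!r}")
        results[rule["name"]] = passed
    return results

# Rules in the spirit of this talk: fewer than five open tickets,
# and zero critical security vulnerabilities.
rules = [
    {"name": "few_open_tickets", "property": "open_tickets",
     "operator": "<", "threshold": 5},
    {"name": "no_critical_vulns", "property": "critical_vulnerabilities",
     "operator": "==", "threshold": 0},
]

service = {
    "identifier": "authorization-service",
    "properties": {"open_tickets": 3, "critical_vulnerabilities": 1},
}

print(evaluate_scorecard(service, rules))
# {'few_open_tickets': True, 'no_critical_vulns': False}
```

In a real portal, rules like these would be configured in the product rather than coded by hand, but the evaluation logic behind the scorecard view looks roughly like this.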
So they are able to configure scorecards for the different services, clusters, and cloud resources, and say: okay, I want a given microservice to have fewer than five open tickets, or I want it to have no critical security vulnerabilities. Those scorecards give us an overview of how a service is doing and how it stands with regard to our best practices and our goals, and we can also push initiatives to improve the standing of that service and tell the responsible teams to take care of those misalignments and anomalies, to make sure that all services are in good standing.

Now, of course, the developer portal is surrounded by a very strong role-based access control layer, which makes sure that only the proper personas see the information they need. It allows us to create those tailored views and avoid exposing too much information to personas who don't need it, to reduce cognitive load, and also to make sure that certain expensive or dangerous requests or self-service actions can only be performed or approved by the proper people in the organization. And of course, the developer portal provides a very strong interface, a convenient way to consume the information in the portal, whether through the UI, through an API, or through something such as a chat app or chatbot. Alongside all of the information that we have in the portal, we also have the R&D insights and reports layer, which gives us visualizations over that information: what developers are doing in the portal, what actions are being executed and their status, and so on. Since today we will be talking about how to create a data model and what our data model should contain, the primary layer that we're going to talk about is the software catalog.

Now, what is a data model? A data model is a representation of the layout and architecture of the different components that make up your environment and infrastructure.
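As a toy illustration of that idea, an infrastructure data model can be thought of as a set of entities plus typed relations between them. The component kinds and relation names below are invented for this sketch; they are not any portal's real format.

```python
# A toy in-memory data model: entities of different kinds, plus relations
# that capture interdependencies. Kinds, names, and relation types are
# invented for illustration only.

entities = {
    "checkout-service": {"kind": "service", "language": "Go"},
    "prod-cluster":     {"kind": "cluster", "provider": "AWS"},
    "orders-db":        {"kind": "database", "engine": "PostgreSQL"},
}

# (source, relation, target) triples describing how the pieces connect.
relations = [
    ("checkout-service", "deployed_in", "prod-cluster"),
    ("checkout-service", "depends_on", "orders-db"),
]

def related(source: str, relation: str) -> list:
    """Follow one relation type outward from an entity."""
    return [t for s, r, t in relations if s == source and r == relation]

print(related("checkout-service", "depends_on"))   # ['orders-db']
print(related("checkout-service", "deployed_in"))  # ['prod-cluster']
```

Everything a portal's software catalog does on top of this — graph views, tailored pages, scorecards — is built from exactly these two ingredients: entities and the relations between them.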
The goal of the data model is to make it easy to understand the interdependencies in your infrastructure and how your SDLC tech stack comes together. The idea is to create a model that shows how the different pieces in your infrastructure connect to one another, how they speak to each other, and what the dependencies and interdependencies are, to be able to create that proper cloud map showing you everything from the first line of code to the service running in your cluster in your cloud provider, and all of the pieces in between.

Now, why do we even need a data model? Every organization is a bit different. Everybody has a different architecture, uses a different set of tools and a different cloud provider, and the frameworks might vary; those differences might be very slight or very broad, and some organizations are completely different from one another. So we need a data model that is flexible and customizable, because the data model should serve you, and it should make it possible to build the best developer portal for your organization and for your needs. That is why a flexible and dynamic data model is crucial. The data model should reliably describe your infrastructure in a way that resonates with your organization's terminology, architecture, and workflow. That means it can't be a closed set of resources which you simply ingest information into; it needs to be customizable, and it needs to support custom connections and custom properties which match what you care about and would like to track in your developer portal and in your organization.
Now, a data model is a way of addressing desired use cases for the internal developer portal: tracking incidents, managing microservices, resolving vulnerabilities. These will all create different initial data models, and as we'll see going ahead, we start with a core data model (we're going to explain what a core data model is), but the key is to iterate over the data model and continue to extend it according to the needs of your users and their use cases.

So how do we go about defining a data model MVP? At first it might seem a bit daunting to create a data model, but it really shouldn't be. The core thing to remember is that while the data model is the starting point for your software catalog, creating the model is an iterative process. As time goes on you will continue to add, modify, change, and delete components in the data model, but that should not stop you from getting going and creating that initial portal. The idea is to get started: find a particular use case which your developers want to solve, which your developers have a pain point in, and implement the initial blueprints that represent and solve that use case. Now, notice that I mentioned the term blueprints a few times; we're going to explain what that is in the next slide, so hold on. Once the initial use case is implemented, the goal is to continue to extend the data model and implement additional use cases, and as time goes on the data model will continue to grow. But again, because it is such a flexible object, it is perfect for that purpose: it can continue to adapt to your needs and to the needs of your users.
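As a tiny sketch of that iterative growth, the model can start with a single kind of component for the MVP and gain new kinds and properties over time. The dict shape below is made up for illustration; it is not a real portal's schema format.

```python
# A toy model of the iterative process: start with an MVP data model and
# extend it as new use cases arrive. The structure is invented for this
# sketch, not taken from any real portal product.

data_model = {
    "service": ["url", "team", "language"],  # the initial MVP
}

def extend_model(model: dict, kind: str, properties: list) -> None:
    """Add a new kind of component, or new properties to an existing one."""
    model.setdefault(kind, [])
    for prop in properties:
        if prop not in model[kind]:
            model[kind].append(prop)

# A later iteration adds a new use case (environments) and a new property.
extend_model(data_model, "environment", ["cloud_provider", "region"])
extend_model(data_model, "service", ["readme"])

print(sorted(data_model))     # ['environment', 'service']
print(data_model["service"])  # ['url', 'team', 'language', 'readme']
```

The point is only that extending the model is additive and cheap: nothing about the initial MVP has to be thrown away when the next use case arrives.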
So we talked about blueprints a bit; let's explain what a blueprint is. Blueprints are schema definitions for any type or kind of asset in your software catalog. Blueprints are the customizable components which make up the schema of your data model, and blueprints are made of properties. So what does that mean? You can think of a blueprint as a schema for a table in a database. You can add all the different columns you want (those are called properties), and it should support all the major types: from the standard primitives, meaning strings, numbers, and booleans, all the way to arrays, embedded markdown, users from within the portal itself, and embedded iframes, to get the most customizable and tailored component and schema for your organization. Here you can see an example of a service blueprint, and once we start diving a bit deeper into the core model, you're going to see exactly why that service blueprint is important and why it looks the way it does. We're going to dive a bit deeper into that in just a few minutes.

So let's talk about creating a core model. The core model is the initial set of blueprints used for your first developer portal use case, the MVP. What that means is that this is the first set of blueprints we're going to implement; it's the starting point from which you will continue to add more blueprints and more use cases. The idea is that as your developer portal grows in use cases and internal adoption, your data model will also grow. That initial set of blueprints will evolve, and in addition more blueprints will be added around it, creating a more complete map and a more complete developer portal which contains the answers for all of your internal use cases and for all of the different personas that might consume the portal. You can see here the core data model that we will be talking about; we're going to start diving a bit deeper into each and every one of those components. Like I said, those are called blueprints, and we're going to see why their flexibility matters so much, and also why we went with this as the core model.

So our core model is the SDLC core model, and now let's talk a bit about the motivation and explanation for why we went with it. A very common core model for developer portals is a model for the software development life cycle, and this is something that we see all the time with different adopters of developer portals with different use cases: we see it with users of Port, we see it with users of Backstage, and we see it with users who implement their own developer portal internally. The reason is that this model is very easily extensible, and it also serves the core abstractions and visualizations portal users need 99% of the time. Really, every time we talk to a potential customer, or to someone who is interested in developer portals or feels they have the need for a developer portal in their organization, it always revolves around the SDLC use case first, because that is the biggest pain point.

So what does the SDLC core model contain? It contains three blueprints: the first is service, then we have environment, and finally we have running service. Now we're going to dive deeper into each one of those specifically and understand what its goal is, what it contains, and what information it exposes for us. Let's start with the service blueprint. The service blueprint is used to represent a static code repository and its related metadata for a microservice. It is very common to use the service blueprint for metadata information, and we have some example properties here for the service blueprint: the URL to the service repository, the team responsible for the service, the language of the service, the README documentation, and the service architecture diagram. Now, remember, we said that blueprints are customizable objects and
that developing the core data model, and the data model itself, is an iterative process. So what you're seeing here is a suggestion for an initial core data model, but since every organization is a bit different, your core data model might differ: it might have an additional blueprint or two, or it might have two fewer properties. It doesn't matter; the idea is to start with a single use case that fits you and that will answer the pain points of your developers and of the consumers of your developer portal. We're going to see why we went with exactly this initial set of properties in the service blueprint in just a few minutes, when we see the end result inside the developer portal. But use this as a reference, use this as a guide, and also keep in mind why we even want all of this information. When we're talking about a service, a static code repository, we want to have all the necessary information in our developer portal. We don't want to run around too much and have too many open tabs; we want to keep track of everything from a single place, a single pane of glass, and the service blueprint in the core data model is exactly that. The URL to the service repository means we don't have to search around our Git provider too much. Having the team responsible for the service as the team property means that we always know who to contact if there's an issue or if we want to learn more about the service. The language of the service makes it very easy to understand whether the service can easily be picked up by us if we need to add a new feature, to track how many services in our organization use a particular language, or to make sure that it's using the latest version that has no vulnerabilities. The README documentation makes it very easy to onboard new developers, and also for existing developers to pick up the service and start using it if they need to consume it or contribute to it. And the service architecture diagram means there is no need to dig around documentation for the proper and latest architecture; it's all seen right there in the portal, available for viewing instantly.

Now let's talk about the environment blueprint. The environment blueprint is used to represent an environment where microservices are deployed and resources are hosted. Use the environment blueprint to keep track of the environments where services are deployed, and also to gain visibility into the different environments maintained by the organization. As organizations grow, and today we are in a very cloud-native-oriented development methodology, we see a lot of organizations having multiple cloud accounts and multiple cloud environments, working in multiple regions, serving customers from all around the globe. So keeping track of all these different environments, their components, and their different cloud resources can become very challenging. The idea of the environment blueprint is to hold all of that information, to be that high-level grouping that gives us visibility into everything it contains. In order to do that, here are some common properties for the environment blueprint. First, the cloud provider the environment is hosted on, which can be very beneficial for companies that are in transition between different cloud providers, or that are working with a multi-cloud setup and need to keep track of multiple cloud environments from multiple cloud providers. Then, the type of the environment: is this a production, staging, test, or QA environment, and so on. And the region of the environment, to be able to see what regions we are working in, whether we are missing some regions for some of our clients, whether we are upholding all of our compliance requirements, and so on. All of those things can be implemented through the environment blueprint. And also, as I'm going to discuss at the end of the presentation, the environment blueprint is a good starting point to look into things such
as FinOps, and understanding the costs of an environment hosted on the cloud.

Now we have the running service blueprint. The running service blueprint is used to provide the runtime context for a service running in a given environment. That is important because a service on its own is just static code; we said it's just a repository with some metadata: the language, the responsible team, and so on. That is all nice and well, and in addition an environment is a collection of resources, telling us which cloud providers and cloud environments we are using, whether we're on AWS or GCP or Azure, and which regions we are in. But the running service is meant to bring in runtime information and make the connection between those two blueprints. The running service blueprint gives us the option to answer questions like: how is my service doing in production? What is its status? What are the URLs to its monitoring dashboards and logs? What is the current API definition, and what is the URL to consume that API? All of that is runtime information that we can only get from an actual service running in an actual environment, and this is why the running service blueprint is so important: it gives us the runtime context that a service entity from a service blueprint on its own does not give us. It also tells us where the service is running, and whether we should care if it is not healthy; for example, if it's a production service we might need to wake up the on-call developer, but if it's a test service we can take care of it tomorrow. All that information is contained in the running service blueprint, and that is why it's so important. Some common properties for the running service blueprint are the commit hash of the deployed version, which helps us understand what version is actually deployed in production right now, or what version is deployed in test right now that we would pretty soon want to promote to production; the URL to the Grafana or Prometheus dashboard; and the Swagger API reference for that running version. For example, if we have a version running in production, our developers can test against it and learn how to use that API with their own services in their development process as they add more features.

Now, after we have all three of those blueprints in place and we have that core model, what is the actual result? As you can see here, we have those three blueprints and we also have relations between them. This is what gives us that strong context. It means that we can look at an environment and see all the different running services, so we can see everything that is currently running in production. But we can also look at it from a service perspective and see, for a given service, where it is currently deployed; for example, we might notice that a service isn't running in staging and is only running in production, so we have no way to reliably test and validate changes before we deploy to production. And we can also look at a given running service, for example our running service in production, look at its metrics, and keep direct access to its logs and its API reference, so that we can validate that everything with the service is okay: it's running properly, it's using the expected amount of resources, for example CPU and memory, and so on. So the SDLC core model provides us with a basic layer of visibility into our services and environments, with the proper runtime context. And while this model is only the core, again, it is a starting point that provides developers with insights into real services, the current state of the system, and runtime information, all in a single pane of glass. Soon I'll show you the complete example, with the information already ingested into the catalog, so you can see all the different views and visibility insights that we can gather from this core data model. So now let's talk about extending
the core model. Remember that the core model is only the starting point, and designing the data model for your developer portal is an iterative process. We start small, with the core model, but then we continue to iterate, build on it, and extend it according to the needs of our users and to the different use cases that we want the developer portal to tackle. Remember that the goal of the developer portal is to be the ultimate pane of glass, that one tab that developers, DevOps, and platform engineers all need in their browser, with all the information and the tailored views, so they don't need to look around too much for different things. Once you have an initial core model in place, for example the SDLC core model that we suggested, you can start extending it with additional information. Some common examples of core model extensions are: CI/CD pipeline data, ingesting all the different deployments, builds, and service promotion tasks to bring in even more runtime and process information, to understand exactly how services change over time, and also, for example, to figure out the root cause if we have some sort of issue in our CI/CD process or a lot of builds that are failing; application security data, looking at outdated libraries or security vulnerabilities and making sure that we're compliant with the organization's standards and requirements; project management data, such as tickets, sprints, and projects, understanding how a certain sprint is going, how many features have already been implemented and deployed for a given service in a given sprint, and so on; and alerting and tracing data, meaning incidents, alerts, on-call paging, and so on. Being able to page the on-call directly from the portal, if there is an issue that we noticed for a given running service, means that we have all the information in one place. Then, even when the on-call wakes up at 3 a.m., they simply open their developer portal and have all the information they need straight away; they don't need to look around too much, and solving the issue becomes a much more effective process. And as I mentioned, we can also add FinOps data, such as how much an environment, a service, a sprint, or a given feature costs to develop, maintain, and host. We know that this is highly critical information that organizations really want insight into today, and a developer portal with a core data model that can be extended to contain that FinOps information is crucial for organizations; it is a very popular extension of the core data model. And of course, as we said, this is an iterative process. You want to continue extending, and you want to continue listening to feedback that you receive from your developers and from the personas in your organization. Remember that the core goal of the platform engineering and developer portal movement is to improve developer experience, to improve efficiency and productivity, and to make developers' lives better and easier. So ask your users, talk to them, request feedback, understand their pain points, and extend the core data model accordingly.

Before we move to the questions, I want to show you exactly what a core data model looks like when it's implemented. Let's take a look: here in the builder page we have the three blueprints that we mentioned for the core data model, so we have the running service, environment, and service blueprints. As you can see, they contain all the properties that we've discussed, and remember, since we're talking about blueprints, which are simply schemas, we can always go and add properties as needed, and we have all of the major types, including markdown, Swagger UIs, embedded URLs, and so on. This shows you the power and flexibility of the developer portal. Now let's take a look at the actual result. So if we go to the
catalog, first of all you'll notice that we have a page for every one of our blueprints. In the service page we can see our different services: the URLs, the languages, and a link to the architecture diagram. If we look at the environments, we can see our different environments; we can see that we're using AWS and that we have three environments: production, test, and staging. And we have our running services: a few instances of services running in the different environments. Again, remember, this is a small example just showing the core model, but let's take a look at what visibility this already gives us. First of all, if I dive into a specific service, what I can see is all of the runtime information provided by the running service blueprint and the running service entity, here for the authorization service running in production. If we look at the related entities section, we can actually see that this service is running in production, and we can also see the entity for the service itself, so we can go to the repository URL, look at the language, look at the architecture diagram, and so on. And since the data model and the developer portal are meant to show all the interdependencies between the different components, we can also take a look at the graphical view and see exactly how this running service connects to the rest of my infrastructure. Remember that when you extend the data model further and add more blueprints, you will see those appear here, so we'll have a complete cloud map showing everything, going from a single line of code all the way to the cloud provider, the cluster, and the specific Kubernetes container running that service, and all the information it contains. And again, as we said, we have the Swagger API, the logs dashboard URL, and so on. If we go ahead and take a look from a different perspective, specifically at the production environment, we can also see all the services running in production right now. As this list fills up and we bring more information into the developer portal, we will have a complete list of services, and we'll be able to see exactly what is currently running in production, what its status is, and how it is doing: is it healthy, is there something we need to take care of, and so on. If we take a look at it from the authorization service's point of view, we can also see all the different running instances. For example, we can see here that the authorization service is currently running in production and staging. That is a very good result for us, because that way we know we have a pre-prod or staging environment where we can test and validate new versions before we promote or deploy to production, so we have a very standardized SDLC process for our development. And again, as we said, this service view also gives us a lot of high-quality information for users who want to consume this service or learn about it: we have the README documentation and the architecture diagram right here. This already gives us a lot of the information that we know (and I believe that you also know, if you are watching this session) developers are missing and really want insight into, in that one pane of glass that provides them with all that high-quality data.

And that's it, this is the session. I hope you learned a lot; I hope you now understand what a data model is and what a core data model is, and that this is an iterative process which you should continue to work on and improve. Do not be afraid to make the first step, and do not be afraid to implement the core data model that we have shown here. Or you can talk to your developers and listen to their pain points; maybe they care about something other than the SDLC and want to tackle that first, so go ahead and do that. And again, if you have any thoughts, if you
have any comments, feel free to contact me; my email is right here, and also feel free to reach out to us.