Hello and welcome to this session. Thank you for joining. Today we're going to talk about how to shift mindsets from data product to data as a product. My name is Omar Salam and I'm a senior product manager with Booking.com. I have about 13 years of experience across project management and product management, about six of those in product management, and I've worked previously with Delivery Hero and the junior group. Throughout my experience with different business domains and different kinds of products, I've always believed that data is the most important thing that we as product managers can rely on to understand the behaviors of our customers and the needs of our stakeholders in general, and of course also to justify and validate any hypothesis we're thinking about. Especially recently, as digitization grows and technology gets more and more into our daily lives and routines, the amount of data is growing exponentially. Different types of data, relevant in different ways to different domains, keep increasing. About 90% of the world's data that exists right now was created or acquired within the last two years. That implies that data, if used correctly and efficiently, can be one of the most valuable assets any organization could possess. But it doesn't come with benefits only. It also comes with a lot of challenges in how to manage this huge amount of data. At the organizational level, in some places data is still managed in silos up till now: it's managed separately by different teams and different products, and they're just maintaining it to keep their own product working.
That causes a lot of problems. It's a challenge to manage this data because it's usually fragmented across hundreds of systems, and a lot of different applications and products are using it and contributing back to it at the same time. It's usually locked in legacy systems and legacy databases, which causes a lot of fragmentation, and that leads to an unclear structure or even dark structures that nobody really knows about, so ownership becomes a problem. Managing this data and making sure that it's compliant and well maintained becomes a very big problem. As a result, as I said, many organizations are considered to be in the dark when it comes to most of the data they own, and this causes a lot of lost opportunity and misdirection in how to make decisions and how to build better features and better products in general. That's why the data mesh framework was introduced by Zhamak Dehghani, driven by the simple idea that the internal stakeholders within the different business domains are the best people to understand their own data. So we're talking about marketing-related data, finance-related data, any specific activity that has its own relevant data. They are the most capable and most qualified people inside an organization to handle their own data, because only they understand what it takes to manage this data, how to make it more relevant to the business, and how to get the best outcome out of analyzing, processing and working with it. This data mesh framework introduced the concept of managing data as its own product. There are four core principles in the data mesh framework. The first one is that data is managed as a product, as I said. Next, data ownership and architecture are domain-oriented: they should be decentralized.
But these four principles balance between total decentralization and actual centralization. If you're in a specific domain, you are trying to take the relevant data that you should own and manage from all over the organization and centralize it within your domain. When all the different business domains inside the organization do that, the data is decentralized across the organization, but centralized to some extent within each domain. Another principle is reaching full self-service for this domain-related data. In the optimal case, anyone who needs to use or work with this data would be able to serve it to themselves, which reduces the pressure on the teams managing the data and optimizes their capacity. That doesn't mean that every domain is totally free to manage the data however they see fit, with whatever standards they decide on. Data governance is very important, and it should be federated. So there is a balance between what you can decide on your own as a business domain and what global standards you should follow to make sure that this data is properly served across the organization, and to avoid the problems that the data mesh framework was introduced to solve in the first place, right? Within the context of this session, we're focusing on data as a product and how to manage your data as its own product. So first we need to understand what the normal definition of a data product means before going to the next one. Basically, a data product is a product that is built specifically to generate value or to deliver a specific customer experience based on some data. Data products are designed to extract insights or knowledge from data and present them in a useful and actionable form, to build a better experience for users and customers.
Sometimes they involve ordinary algorithms, sometimes machine learning, sometimes another advanced type of data analytics, but the main goal is to serve a specific user experience. There are a lot of examples of that: a customer segmentation tool that segments by demographics and provides that kind of analytics, a predictive maintenance system, almost any kind of recommendation engine, and fraud detection systems. Try to think of it as a broader concept, like recommendation systems based on user behaviors and on how this data makes sense when it is put together. Delivering this end-user experience is the goal of the product, but it's based on some data that the team is either managing themselves in a silo or getting from elsewhere. Data as a product is a concept in which the data itself is the actual product. The goal here is to manage this data and work with it to extract more exciting and more meaningful data out of it. So we're using data to generate other data, and this second version of the data is served to other teams inside the organization, who use it to build data products in the sense of the first definition. When you're working with a data set and treating it as a product, you are serving your direct customers, the internal stakeholders, but you are also trying to have a proxy impact on the end users, the external stakeholders who use the final product that your internal stakeholder is building.
This concept might be a little bit confusing, and to make things easier, "data as a product" usually gets shortened to "data product" because it's easier to say. The most important thing is to understand the framework and the concept, and to apply it when you're working with data in general or building a strategy or a structure for your own company. So again, to sum up this complicated and potentially misleading comparison of data product versus data as a product: a data product is a product that uses data to operate. It doesn't matter if the data is managed in a silo, centrally sourced, or coming from somewhere else externally or internally; you're just building some sort of customer experience that feeds on data. With data as a product, the actual data is the product that you are trying to maintain, scale and serve to other stakeholders, who use it to build data products, and you're trying to produce more impactful data to proxy your impact to the end user. In order to do that, you need to start thinking about how to apply product thinking to data and how to use the different product management frameworks to manage your data, to make sure that you are cascading this multi-level impact to your internal stakeholders and to your proxy stakeholders, if we may call them that. The most important thing at the beginning is how well you define your scope. So, quoting Zhamak Dehghani here: for a distributed data platform to be successful, domain data teams must apply product thinking with similar rigor to the data sets that they provide, considering their data assets as their products and the rest of the organization's data scientists, ML and data engineers as their customers, as we already established.
It's very important to define all of these, and also to distribute data ownership and the different pipeline implementations into the hands of the business domains. Make sure that you have a very clear definition of what belongs to you and what doesn't, and a well-defined ownership matrix, aligned with your stakeholders, of what you should own and what they should own, so that you have a crystal clear vision of what you are doing, what you are trying to build and what you are trying to achieve. This is very important, and there are a couple of questions that you need to ask yourself and work through with the different stakeholders. What is the data scope and what is the scope domain? What kinds of data and taxonomies already exist that we need to source? Who needs this data, and where is it flowing from and to: what are your data sources and who is consuming your data? What purposes is this data serving, and is there a way to make it easier to work with or to access in its current state, your starting ground? Is this data compliant and actionable or not, and what kind of governance exists right now at that starting ground? And how can this data be optimized to serve the internal stakeholders' needs and create a proxy impact for our external customers? That last question is like a "how might we" statement, which is commonly used in design thinking, and that's the next step that you should start applying.
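As a rough illustration of the ownership matrix and the scoping questions above, here is a minimal Python sketch. It is not from the talk: the class `DataProductScope`, its fields, and the helpers `ownership_matrix` and `open_questions` are hypothetical names chosen only to show how the answers could be recorded and checked for gaps.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DataProductScope:
    """Hypothetical record of the scoping answers for one data set."""
    name: str                 # the data set being scoped
    domain: str               # business domain that owns it
    sources: List[str] = field(default_factory=list)    # where data flows from
    consumers: List[str] = field(default_factory=list)  # who consumes it
    purposes: List[str] = field(default_factory=list)   # what it is used for


def ownership_matrix(
    scopes: List[DataProductScope],
) -> Tuple[Dict[str, str], List[str]]:
    """Map each data set to one owning domain and list contested data sets."""
    matrix: Dict[str, str] = {}
    conflicts: List[str] = []
    for scope in scopes:
        owner = matrix.setdefault(scope.name, scope.domain)
        # Two domains claiming the same data set is an ownership conflict.
        if owner != scope.domain and scope.name not in conflicts:
            conflicts.append(scope.name)
    return matrix, conflicts


def open_questions(scope: DataProductScope) -> List[str]:
    """Return the scoping questions that still have no recorded answer."""
    gaps = []
    if not scope.sources:
        gaps.append("Where is this data flowing from?")
    if not scope.consumers:
        gaps.append("Who is consuming this data?")
    if not scope.purposes:
        gaps.append("What purposes is this data serving?")
    return gaps


scopes = [
    DataProductScope("customer_segments", "marketing",
                     sources=["crm"], consumers=["campaign-team"]),
    DataProductScope("customer_segments", "finance"),
]
matrix, conflicts = ownership_matrix(scopes)
print(matrix, conflicts, open_questions(scopes[0]))
```

The point of the sketch is only that ownership conflicts and unanswered scoping questions can be made visible early, before any pipeline work starts.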
So after you've defined your ownership and your scope, and you know what you own, what you don't own, and how you want to govern the whole thing in general, you need to start applying the concepts of design thinking. That lets you brainstorm, plan and discover the actual potential of the data set that you are supposed to be managing and, related to that, who your customers are, what their problems and needs are, and how you can serve them in the best way. As most of you probably know, design thinking works by considering human needs and human problems, your end customers' needs and problems, before considering the actual solutions. We're trying to look for a problem, and that's the first step of the double diamond framework, as you can see. First you focus on the problem space: you try to discover and define the problems and needs that your customers have, and then you start iterating and thinking about what the solutions could be. You need to work with your different internal stakeholders, because these are your direct customers. You need to identify them and understand what their needs are and what kinds of problems they are trying to solve for their own customers. This is the essence of how to manage a data product. Again, you're serving data to an internal stakeholder, and this internal stakeholder is using the data to build a customer-engaging product. To do that, you need to understand very well the kind of impact they are trying to deliver, so that you can give them data, an output of your product, that helps them deliver it. Design thinking, as we've been saying, is a structured approach to identifying and solving problems. It has five steps that you need to go through, at least with your internal stakeholders, while also getting access to or being involved in their own research with their end users.
The first one is to empathize with your customers: to really understand their needs and the reason behind everything that they are requesting, the actual grounds of the needs that they have. Again, as we established, this setup is domain-related, so you in your domain should be the best person to know the actual potential of your data set and what it can provide to your internal stakeholders. Classic product management thinking says that if a customer comes to you and says, "I just want this," they probably don't know what they want. They want to satisfy a very direct, very basic need. The work that you do on discovery, validation, revalidation and iteration should help you identify the actual problems and the root causes, and provide even better solutions that totally eliminate the problem or totally satisfy the need, even as it scales exponentially. So you need to empathize with your customers and your proxy customers. You need to define the scope, define who the customers are, and define the different problems. Then, after you have identified the problems, the risks, how the market looks, the current state and what could potentially happen, you need to start ideating with your internal stakeholders and with every member of your team, whether engineers, machine learning consultants, data scientists and so on, about how to actually solve these problems. There are a lot of practices for that, the classic product management practices. That can be a discussion for another time. And you definitely need to work with your internal stakeholders to prototype and test the outcomes of delivering this value to them, and of building the features or specifications in the data product that you are serving to them.
You need to understand the impact of what you've built and whether it confirms the hypothesis you started with or not, because if you just deliver on requirements and that's it, you will have a lot of missing pieces in your puzzle. You won't see the full picture and you won't be able to iterate effectively. So you need to be involved in the product launches of your internal stakeholders, and you need to be very involved in the experimentation: how they are setting up their own experiments, what the results are, and how they are planning to iterate, because you can also support that iteration. After that comes the other part of managing a data product and a data team, the product engineering part. A big part of it is how the dynamics of the team are managed between the product manager and the engineers. Of course, I'm sure that you already know a lot about this. I would suggest reading more about managing the relationship between product managers and engineers, the dynamics of such teams, and the philosophy of building this kind of team spirit and collaboration. But for the scope of this presentation, you need to focus on the core values or core principles that need to be delivered with any piece of your data product, whether it's a new piece being built or an improvement to the quality of what already exists as a legacy system. From an engineering perspective, data products need to have six main attributes. They need to be discoverable, self-describing, interoperable, addressable, trustworthy, and secure. You can map these attributes to the four core principles of the data mesh. If you remember from one of the first slides, eventually you should be steering your strategy toward more self-service and toward data governance. So let's go a little bit more into these concepts.
Discoverable means that you need to provide some sort of search capability over the data set, the data stock and the taxonomy that you have. If your data qualifies to be considered its own product, that means you are managing a huge and very complicated data set with a very complicated taxonomy. You need to make it easy for different teams and stakeholders to search through it and understand what data you actually own. Especially in large organizations, where there might be tens or even hundreds of teams, you can't explain all of that to each and every team. Instead, you need to shift the ability to search and understand the taxonomy to them, and you need to provide the means for that. Through that, and through the next principle, they should also be able to identify what they need and request access to it. In a perfect world, this whole process should be automated, or at least semi-automated, to reduce the burden on your own team and free up its capacity to work on better things. Again, this data should be addressable. It should be understandable, and that leads to more productivity for your team and for your internal stakeholders. You don't want your team, your data analysts or engineers, to always be answering questions like: Where is this data? What kind of data do you have? How would that be useful to me? What does it contain? How many records? And so on. So your data really needs to be searchable and addressable. Of course, you also need to establish trustworthiness in the quality of the data that you have.
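To make the discoverability and addressability points above a bit more concrete, here is a minimal Python sketch of a searchable catalog. It is an illustration under stated assumptions, not something shown in the talk: `CatalogEntry`, `DataCatalog`, the address format and the sample entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CatalogEntry:
    """Hypothetical catalog record answering the usual 'what is this data?' questions."""
    address: str        # stable, unique address consumers can reference
    domain: str         # owning business domain
    description: str    # what the data contains
    record_count: int   # how many records it holds
    tags: List[str]     # taxonomy terms used for search


class DataCatalog:
    """In-memory catalog: register entries, search by keyword, look up by address."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        # Addressable means each address identifies exactly one data set.
        if entry.address in self._entries:
            raise ValueError(f"address already registered: {entry.address}")
        self._entries[entry.address] = entry

    def search(self, keyword: str) -> List[CatalogEntry]:
        # Case-insensitive match against the description and the tags.
        kw = keyword.lower()
        return [
            e for e in self._entries.values()
            if kw in e.description.lower()
            or any(kw == t.lower() for t in e.tags)
        ]

    def get(self, address: str) -> CatalogEntry:
        return self._entries[address]


catalog = DataCatalog()
catalog.register(CatalogEntry(
    "marketing/customer_segments/v1", "marketing",
    "Customer segments by demographics", 120000, ["segmentation"]))
catalog.register(CatalogEntry(
    "finance/invoices/v2", "finance",
    "Issued invoices per customer", 950000, ["billing"]))

for hit in catalog.search("customer"):
    print(hit.address, hit.record_count)
```

With something like this in place, questions such as "where is this data?" and "how many records?" can be answered by the consuming team itself instead of by your analysts.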
So you need to establish regular, and hopefully automatic, quality checks and quality audits on your data, and you need to be automatically alerted to or made aware of any discrepancies, so that you can manage the data itself: not just the data infrastructure, but the quality of the data. As I said, you need to establish this trustworthiness with your internal stakeholders so that they can rely on the data you provide and build products on it that fuel customer experiences, and as far as possible there shouldn't be any problems with the data quality. Adding to the searchability and addressability, your data structure and your data information in general should be self-describing. Again, this is mainly to minimize the effort and pressure on your teams, to allow for more self-service, and to let any internal stakeholder understand and find what they need in this data. Things like data location, data mapping, examples of how to use the data and sample applications all need to exist for your data, so that it can be easily understood and hence extended and scaled to more and more use cases. Of course, it also needs to be interoperable throughout the ecosystem of your organization, and that mainly relates to the data governance concept of the data mesh. It should be usable anywhere in the organization. It should fit into any other system and be combinable with any other data that comes from other domains or that is used in silos by specific teams for specific experiences. It should be compliant with the organization-wide, holistic governance programs that you have. And of course, your data needs to be secured. It shouldn't be automatically available to anyone, because you want to prevent any kind of misuse or misprocessing of this data.
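Going back to the automated quality checks mentioned a moment ago, here is a minimal Python sketch of what such a check could look like. It is an assumption-laden illustration rather than anything from the talk: the function `check_quality`, the two rules it applies (required fields are filled in, a key column is unique) and the sample rows are all hypothetical.

```python
from typing import Any, Dict, List


def check_quality(
    rows: List[Dict[str, Any]],
    required_fields: List[str],
    unique_key: str,
) -> List[str]:
    """Return a list of human-readable discrepancies found in the rows."""
    issues: List[str] = []
    seen = set()
    for i, row in enumerate(rows):
        # Rule 1: every required field must be present and non-empty.
        for name in required_fields:
            if row.get(name) in (None, ""):
                issues.append(f"row {i}: missing {name}")
        # Rule 2: the key column must not repeat across rows.
        key = row.get(unique_key)
        if key in seen:
            issues.append(f"row {i}: duplicate {unique_key}={key}")
        seen.add(key)
    return issues


rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 1, "email": "c@example.com"},
]
issues = check_quality(rows, required_fields=["email"], unique_key="id")
for issue in issues:
    # In practice this is where an alert to the owning team would be raised.
    print(issue)
```

Run on a schedule against each data set, a check like this is one way to hear about discrepancies before your internal stakeholders do.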
You need to protect it from being edited or modified without authorization, whether intentionally or by mistake. This also falls within the objective of having global, holistic governance across the organization and being fully compliant with it. Of course, it also includes managing data access, deciding who is able to work with this data, and tracking the usage and the behavior of the different users in how they apply it. So to recap the key takeaways: depending on how data is treated, it could be the most important asset or the biggest lost opportunity for an organization. Data mesh is a framework for balancing data centralization and decentralization based on the business domains inside the organization. And the main topic of this presentation is the comparison between data product and data as a product, and how to make sure that your mindset is in the right place. A data product just consumes data to fuel an experience, while data as a product is a framework for governing and generating more interesting data that you provide to other teams, who build data products to deliver value to the end customers of the organization. So in a way, you're having a proxy impact on the direct and end customers of the organization. And as we discussed, product thinking and design thinking principles make a huge difference in how you manage data as a product. Here I've added some useful reads and references that you can check to learn more about data as a product. Of course, this is a huge topic and it needs much more time to discuss, but I was aiming to give you a highlight and a holistic view of this concept, and I hope it will be beneficial for your day-to-day activities in managing data products. Thank you so much for your time. Please feel free to reach out to me on LinkedIn with any questions or discussion points.
I'm always available and I hope to see you again soon. Thank you so much for your time and have a great day.