Hello, and we're here at theCUBE's startup showcase, made possible by AWS. Thanks so much for joining us today. You know, when Zhamak Dehghani was formulating her ideas around data mesh, she wasn't the only one thinking about decentralized data architectures. HelloFresh was going into hyper growth mode and realized that in order to support its scale, it needed to rethink how it thought about data. Like many companies that started in the early part of last decade, HelloFresh relied on a monolithic data architecture, and the internal team had concerns about its ability to support continued innovation at high velocity. The company's data team began to think about the future and work backwards from a target architecture which possessed many principles of so-called data mesh, even though they didn't use that term specifically. The company is a strong example of an early but practical pioneer of data mesh.

Now, there are many practitioners and stakeholders involved in evolving the company's data architecture, many of whom are listed here on this slide. The two highlighted in red are joining us today, and we're really excited to welcome them to theCUBE: Clemens Chi, Global Senior Director for Data at HelloFresh, and Christoph Zavada, also a Global Senior Director of Data at HelloFresh. Folks, welcome. Thanks so much for making some time today and sharing your story.

Thank you very much. Thanks, Dave.

All right, let's start with HelloFresh. You guys are number one in the world in your field. You deliver hundreds of millions of meals each year to many, many millions of people around the globe. You're scaling. Christoph, tell us a little bit more about your company and its vision.

Yeah, should I start, or Clemens, maybe take over the first piece? Because Clemens has had a longer trajectory at HelloFresh.

Yeah, go ahead, Clemens.

Sure, thanks. I mean, yes, approximately six years ago I joined HelloFresh, and I didn't think the startup I was joining would eventually IPO, yet just two years later HelloFresh went public. And approximately three years and ten months after HelloFresh was listed on the German stock exchange, which was just last week, HelloFresh was included in the DAX, Germany's leading stock market index, and that marked a great, great milestone. I'm really looking forward, and I'm very excited, about the future of HelloFresh.

The vision that we have is to become the world's leading food solutions group, and there are a lot of attractive opportunities. Recently we launched and expanded in Norway, this was in July, and earlier this year we launched the US brand Green Chef in the UK as well. We're committed to continuously launching in different geographies in the coming years, and we have a strong path ahead of us. With the acquisition of ready-to-eat companies like Factor in the US and the planned acquisition of Youfoodz in Australia, we are diversifying our offering, reaching more and more untapped customer segments, and increasing our total addressable market. So by offering customers a growing range of different ways to shop for food and consume meals, we are charging towards this vision and this goal to become the world's leading integrated food solutions group.

Love it. You guys are on a rocket ship, you're really transforming the industry, and as you expand your TAM, it brings us to data as a core part of that strategy.
So maybe you guys could talk a little bit about your journey as a company, specifically as it relates to your data journey. I mean, you began as a startup, you had a basic architecture, and like everyone you made extensive use of spreadsheets. You built a Hadoop-based system that started to grow, and when the company IPO'd you really started to explode. So maybe describe that journey from a data perspective.

Yes, Dave. So HelloFresh, by approximately 2015, had evolved what amounted to a classical centralized data management setup. We grew very organically over the years, and there were a lot of very smart people around the globe really building the company and building our infrastructure. This also means that there were a small number of internal and external data sources, and a centralized BI team with a number of people producing different reports, dashboards and products for our executives, for example, or for our different operations teams to steer the company's performance. Knowledge was transferred just by talking to each other, in face-to-face conversations, and the people in the data warehouse team were considered the data wizards or the ETL wizards, very classical challenges. And those ETL wizards held a kind of silent knowledge of data management. So a central data warehouse team was responsible for different verticals, different domains, different geographies, and in the beginning this setup gave us the flexibility to grow fast as a company in 2015.

Christoph, anything you might add to that?

Yes, not explicitly to that one, but as Clemens has said, this was the setup that actually worked for us for quite a while. And then in 2017, when HelloFresh went public, the company also grew rapidly. Just to give you an idea of what that looked like, the tech department itself increased from about 40 people to almost 300 engineers, and in the same way the business units, as Clemens has described, also grew substantially. We continuously launched HelloFresh in new countries, launched brands like EveryPlate, and also acquired other brands like Green Chef and Factor. And with that growth, from a data perspective, the number of data requests we were getting became more and more, and also more and more complex. For the team, that meant a fairly high mental load. They had to develop a very deep understanding of the business and also suffered a lot from context switching back and forth. Essentially they had to prioritize requests from our digital product, from the physical product, from marketing, and also from the central reporting teams. In a nutshell, this was very hard for these people, and it led to a situation where, let's say, the solutions we built became not really optimal. So the central function became a bottleneck and slowed down the overall innovation of the company.

I mean, it's a classic case, isn't it? Clemens, the central team becomes a bottleneck, and so the lines of business, the marketing team, sales, say, okay, we're going to take things into our own hands. And then, of course, IT and the technical team is called in later to clean up the mess. Maybe I'm overstating it, but that's a common situation, isn't it?

Dave, this is exactly what happened, right?
So we had a bottleneck, we had those central teams, there was always a little bit of tension. Analytics teams in those business domains, like marketing, supply chain, finance, HR and so on, then started really to build their own data solutions. At some point you have to get the ball rolling, right, and then continue the trajectory. Which meant that the data pipelines didn't meet engineering standards, and there was an increased need for maintenance and support from central teams. Over time, the knowledge about those pipelines, and how to maintain a particular piece of infrastructure, for example, left the company, such that most of those data assets and datasets turned into a huge debt, with decreasing data quality, decreasing trust, decreasing transparency. And this was an increasing challenge, where the majority of time was spent in meeting rooms aligning on data quality, for example.

Yeah, and the point you were making, Christoph, about context switching, this is a point that Zhamak makes quite often: we've contextualized our operational systems, like our sales systems or our marketing systems, but not our data systems. So you're asking the data team, okay, be an expert in sales, be an expert in marketing, be an expert in logistics, be an expert in supply chain, and it starts, stops, starts, stops. It's a paper-cut environment, and it's just not as productive. The flip side of that is, when you think about a centralized organization, you think, hey, this is going to be a very efficient, cross-functional way to support the organization, but it's not necessarily the highest-velocity, most effective organizational structure.

Yeah, so I agree that up to a certain scale a centralized function has a lot of advantages, right? It's clear to everyone who to go to; there's some kind of expert team. However, if you actually want to accelerate, specifically in hyper growth, you want autonomy in certain teams, and you want to move the teams, or let's say the data, to the experts in those teams. And as you mentioned, the central setup increases mental load. You can internally start splitting your team into different kinds of sub-teams focusing on different areas, but that is then again just adding another interface where collaboration needs to happen with external teams. So why not bridge that gap immediately and actually move these teams end to end into the functions themselves?

Maybe just to continue what Clemens was saying, this is actually where Clemens' journey and my journey became one joint journey. Clemens was coming from one of these teams building their own solutions; I was heading the platform team, called the Data Warehouse team in those days. And in 2019, when the situation became more and more serious, more and more people recognized that this model does not really scale. In 2019 the leadership of the company came together and identified data as a key strategic asset. What we mean by that is that if we leverage data in a proper way, it gives us a unique competitive advantage, which could help us to support and actually fully automate our decision-making process across the entire value chain. So what we're aiming for now is that HelloFresh is able to build data products that have a purpose. We're moving away from the idea that data is just a byproduct.
So we have a purpose for why we would like to collect this data; there's a clear business need behind it. And because it's so important for the company as a business, we also want to provide these as trustworthy assets to the rest of the organization, with the best customer experience, at least in a way that users can easily discover, understand and securely access high-quality data.

Yeah. So, Clemens, when you read Zhamak's writing, you see she has the four pillars and the principles. As practitioners, you look at that and say, okay, hey, that's pretty good thinking, and now we have to apply it. And that's where the devil meets the details. So it's the four: decentralized data ownership; data as a product, which we'll talk about a little bit; self-serve, which you guys have spent a lot of time on; and your wheelhouse, Clemens, which is governance and a federated governance model. And it's almost like, if you achieve the first two, then you have to solve for the second two; it almost creates new challenges. Maybe you could talk about that a little bit as it relates to HelloFresh.

Yes. So Christoph mentioned that we identified this challenge beforehand and thought about how we could actually decentralize and empower our different colleagues. We realized that it was more of an organizational or cultural change, and this is something Zhamak also mentioned, I think in one of the ThoughtWorks white papers: it's more of an organizational or cultural impact. So we kicked off a phased reorganization, we're currently still in the middle of it, different phases of organizational restructuring, trying to unlock this data at scale. The idea was really moving away from ever-growing, complex matrix organizations or matrix setups, and splitting between two different things. One is value creation: basically, when people ask the question, what can we actually do, what shall we do? That's value creation. The other is the how, which is capability building, and both are equal in authority. This then creates a strong need for collaboration, and this collaboration breaks up the different silos that were built. Of course, this also includes staffing those teams differently, bringing more data scientists, data engineers, data professionals into those business domains, and hence also more capability building.

Okay, go ahead, sorry.

So, back to Zhamak Dehghani. The ideas crossed over when she published her paper in May 2019, and we thought, well, the four pillars she described, around decentralized data ownership, a data-as-a-product mindset, self-service infrastructure and, as you mentioned, federated computational governance, fit very much with our thinking at that point in time to reorganize the different teams. And this then led not only to an organizational restructuring but also to a completely new approach to how we need to manage data.

Got it, okay. So your business is exploding, your data team has to become domain experts in too many areas, you're constantly context switching, as we said, and people started to take things into their own hands. So again, we said, classic story, but you didn't let it get out of control, and that's important. So we actually have a picture of kind of where you're going today and how it's evolved along this path. If you could bring up the picture with the elephant, here we go.
So I would talk a little bit about the architecture. It doesn't show the spreadsheet era here, but Christoph, maybe you can talk about that. It does show the Hadoop monolith, which exists today; I think that's in a managed hosting service, but you've preserved that piece of it. But if I understand it correctly, everything is evolving to the cloud. I think you're running a lot of this, or all of it, in AWS. Everybody's got their own data sources. You've got a data hub, which I think is enabled by a master catalog for discovery, and all this underlying technical infrastructure that is really not the focus of this conversation today. But the key here, if I understand it correctly, is that these domains are autonomous, and not only did this require technical thinking but a really supportive organizational mindset, which we're going to talk about today. Christoph, maybe you could address at a high level some of the architectural evolution that you guys went through.

Yeah, sure. Maybe it's also a good summary of the entire history. As you mentioned, we started in the very beginning with a monolith on the operational plane. Actually, it wasn't just one monolith but two, one for the back end and one for the front end, and our analytical plane was essentially a couple of spreadsheets. And I think there's nothing wrong with spreadsheets. They allow you to store information, transform data, share information, visualize data, but all without actually separating concerns. Everything is in one tool, and that means it's obviously not scalable, right? You reach the point where this kind of data management within one tool reaches its limits.

So what we did is we created our data lake, as you've seen here, on Hadoop, and in the very beginning this very much reflected our operational systems. On top of that we used Impala as a data warehouse, but there was not really a distinction between what was our data warehouse and what was our data lake; Impala was used as the engine for both the data warehouse and the data lake construct itself. And this organic growth led to a situation, as I think is clear now, where we had centralized modelers for all the domains, there were really loose modeling standards, there was no uniformity. We actually built an in-house way of building materialized views that we used for the presentation layer. There was a lot of duplication of effort, and in the end there was a missing feedback loop that would have helped us improve what we had built. So in a nutshell, as you said, a lack of trust. And this basically was the starting point for us to understand, okay, how can we move away from this? And apart from the organizational structure, where we said, okay, we have these three or four pillars from Zhamak, there's also the next question of how we implement our target architecture, right? What are the implications on that level? And I think that is something we are currently still working through.

Got it. Okay, so I wonder if we could switch gears a little bit and talk about the organizational and cultural challenges that you faced. What were those conversations like? And let's dig into that a little bit; I want to get into governance as well. But first, the conversations on the cultural change.
I mean, yes, we went through hyper growth over the last years, and obviously there were a lot of new joiners, a lot of very, very smart people joining the company, which meant that collaboration got a bit more difficult, of course. These were times of change: you had different artifacts being recreated and communication flying around. We had to build the company from scratch, right? Of course, this then always resulted in the tension I described before. But the most important part here is that data has always been a very important factor at HelloFresh, and we collected more and more of this data and continued to use it to improve the different key areas of our business. Even through the organizational struggles, like the struggles with the central teams, data somehow always helped us grow through this kind of change. In the end, those decentralized teams in our local geographies started with solutions that served the business, which was very, very important, otherwise we wouldn't be where we are today, but they did violate best practices and standards.

And I always use the sports analogy, Dave. Like any sport, there are rules and regulations that need to be followed, and these rules are defined by, call it, the sports association. That's what you can think of as a data governance and data compliance team. Now we add the players, who need to follow those rules and abide by them; this is what we then call data management. And those different players and professionals need to be trained and to understand the strategy and the rules before they can play; this is what I call data literacy.

So we realized that we needed to focus on helping our teams develop those capabilities and teach the standards for how work is done, to truly drive functional excellence in the different domains. One mission of our data literacy program, for example, is to empower every employee at HelloFresh, everyone, to make the right data-informed decisions by providing data education that scales by role and team. And that can be different things, like data capabilities with learning paths, for example: helping people create and deploy data products, connecting data producers and data consumers, and creating a common sense and more understanding of each other's dependencies, which is important for things like SLAs, SLOs, data contracts, et cetera. People get more of a sense of ownership and responsibility. Of course, we have to define what that means. What does ownership mean? What does responsibility mean? But we're teaching this to our colleagues via individual learning paths and helping them upskill to use the shared infrastructure and those self-service data applications.

Overall, to summarize, we're still in this process of learning ourselves; learning never stops at HelloFresh. But we are really trying to make it as much fun as possible, and in the end, we all know user behavior is changed through positive experience. So instead of having massive training programs with endless courses and workshops, leaving our new joiners and colleagues confused and overwhelmed, we're applying gamification: we split it into different levels of certification, where our colleagues can collect points and earn badges along the way, which simplifies the process of learning and keeps users engaged.
And this is what we see in surveys, for example, where our employees value this gamification approach a lot and are even competing to collect those learning path badges to become number one on the leaderboard.

I love the gamification. I mean, we've seen it work so well in so many different industries, not the least of which is crypto. So, you identified some of the process gaps that you saw, and you didn't just gloss over them. Sometimes they say people pave the cow path. In other words, you didn't try to force a new architecture into the legacy processes; you really had to rethink your approach to data management. So what did that entail?

Rethink the way of data management, a hundred percent. If I take the example of the industrial revolution, or a classical supply chain revolution: just imagine that you have been riding a horse, for example, your whole life, and suddenly you can operate a car, or you suddenly receive a completely new way of transporting assets from A to B. So we needed to establish a new set of cross-functional business processes to run faster, drive faster, more robustly, and deliver data products which can be trusted and used by downstream processes and systems. Hence we had a set of new standards and new procedures that fall into the internal data governance and compliance sector. With internal, I'm referring to the data operations around new things like the data catalog: how to identify ownership, how to change ownership, how to certify data assets, and everything around classical software development which we now apply to data. This is a new way of thinking, right? Deployment, versioning, QA, ingestion policies, deletion procedures, all the things that software development has been doing, we do now with data as well. In simple terms, it's a whole redesign of the supply chain of our data, with new procedures and new processes in asset creation, asset management and asset consumption.

So data has become kind of the new development kit, if you will.
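To make Clemens' point concrete, here is one way the "QA for data" idea can look in practice. This is an illustrative sketch, not HelloFresh's actual tooling: the dataset, column names and thresholds are all hypothetical, and the check runs as a build gate the same way a unit test would.

```python
# A minimal sketch of software-style QA applied to a data asset.
# All column names and thresholds are hypothetical illustrations.
import pandas as pd

def check_orders_asset(df: pd.DataFrame) -> list[str]:
    """Return a list of data quality violations for a hypothetical orders dataset."""
    violations = []
    if df["order_id"].duplicated().any():
        violations.append("order_id contains duplicates")
    if df["delivered_at"].isna().mean() > 0.01:  # more than 1% missing
        violations.append("delivered_at exceeds its missing-value budget")
    if not df["country"].isin(["DE", "US", "GB", "AU", "NO"]).all():
        violations.append("country contains unexpected codes")
    return violations

if __name__ == "__main__":
    # In CI this would run against a sample of the real asset and fail the build.
    sample = pd.DataFrame({
        "order_id": [1, 2, 3],
        "delivered_at": pd.to_datetime(["2021-07-01", "2021-07-02", "2021-07-03"]),
        "country": ["DE", "US", "NO"],
    })
    problems = check_orders_asset(sample)
    assert not problems, f"data QA failed: {problems}"
```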
I want to shift gears and talk about the notion of data product. We have a slide that we pulled from your deck, and I'd like to unpack it a little bit. If you can bring that up, I'll read it: a data product is a product whose primary objective is to leverage data to solve customer problems, where customers are both internal and external. So, pretty straightforward. I know you've gone much deeper in your thinking and into your organization, but how do you think about that? And how do you determine, for instance, who owns what? How did you get everybody to agree?

I can take that one. Maybe let me start with the data product itself. I think that's an ongoing debate, right? And the debate itself is the important piece, because within the debate you clarify what you actually mean by a data product and what the mindset actually is. From a definition perspective, I think we found a common denominator where we say, okay, a data product is something which is important for the company and comes with value. What we mean by that: it's a solution to a customer problem that delivers, ideally, maximum value to the business, and yes, it leverages the power of data. And we have a couple of examples at HelloFresh, the historical and classical ones around dashboards, for example to monitor our error rates, but also more sophisticated ones, for example incorporating machine learning algorithms in our recipe recommendations.

However, I think the important aspects of a data product are these. First, there's an owner: someone accountable for making sure that the product we're providing is actually served and maintained, and that it keeps delivering the value we're promising. Combined with that is the idea of proper documentation, like a product description, so that people understand how to use it and what it is about. And related to that is the idea that it has a purpose: we need to ask ourselves, okay, why does this thing exist? Does it provide the value we think it does? That then leads to a good understanding of the life cycle of the data product. By life cycle we mean: from the beginning, from creation, we need a good understanding, we need to collect feedback, we need to learn, we need to rework, and finally we also need to think about, okay, when is it time to decommission that piece? So overall, the core of the data product is product thinking 101: the starting point needs to be the problem, not the solution.

And this is essentially what we saw was missing, what brought us to this data spaghetti we had built, where, in a rush, we built certain data assets, developed in isolation and continuously patched, just to fulfill ad hoc requests, without really understanding the stakeholders' needs. The interesting piece is that the result is duplication of work, and that is not just frustrating and probably not the most efficient way for a company to work; if I build the same data assets with slightly different assumptions across the company in multiple teams, that leads to data inconsistency. Imagine the following scenario: from a management perspective, you ask a specific question, and you get different graphs and different numbers from a couple of different teams. In the end, you do not know which one to trust. And it goes further than that: you do not know whether what you're observing is noise, or whether there's actually the signal you're looking for.

The same goes if I'm running an A/B test. I have a new feature and I'd like to understand its business impact. I run the test against a specific source; in an unfortunate scenario, your production system is running on a different source, so you see different numbers, and what you saw in the A/B test is not what you then see in production. The typical thing is you ask some analytics team to do a deep dive to understand where the discrepancies are coming from; worst case, they use yet another source. So in the end it's a pretty frustrating scenario, and it's a waste of people's time to identify the root cause of this divergence. In a nutshell, the highest degree of consistency is achieved if people are simply reusing data assets.
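As an illustration of the owner, purpose and life cycle attributes Christoph lists, here is a minimal sketch of what a data product's descriptor could look like. The field names, teams and values are hypothetical, invented for this example rather than taken from HelloFresh's platform.

```python
# A sketch of data product metadata: an accountable owner, a documented
# purpose, a promise to consumers, and an explicit life cycle state.
from dataclasses import dataclass, field
from enum import Enum

class Lifecycle(Enum):
    CREATED = "created"
    SERVING = "serving"
    DEPRECATED = "deprecated"
    DECOMMISSIONED = "decommissioned"

@dataclass
class DataProduct:
    name: str
    owner_team: str               # accountable for serving and maintaining it
    purpose: str                  # why does this thing exist?
    slo_freshness_hours: int      # a promise made to consumers
    lifecycle: Lifecycle = Lifecycle.CREATED
    consumers: list[str] = field(default_factory=list)

    def ready_to_decommission(self) -> bool:
        # Purpose review: a product nobody consumes no longer earns its keep.
        return not self.consumers

# Hypothetical example: a metrics product reused by several teams.
conversion_metrics = DataProduct(
    name="conversion_metrics",
    owner_team="growth-analytics",
    purpose="Single trusted source for conversion KPIs",
    slo_freshness_hours=24,
    lifecycle=Lifecycle.SERVING,
    consumers=["ab-testing", "weekly-reporting"],
)
```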
And in the meetup talk that we gave, we described how we started to establish this approach for A/B testing. We have a team which owns the target metrics associated with the other teams, and they provide those as a product to other services, including the A/B testing team. The A/B testing team can consume this information through a defined interface and join it with the metadata of an experiment. In the end, after the assignment and the data collection phase, you can easily add a graph to your dashboard just by grouping by the A/B testing variant.

And we have seen this in other companies too, so it's not just a nice dream we have. I've actually worked at other companies where we worked on search, and we established a complete KPI pipeline that computed all of this information. That information was hosted by one team, and it was used for everything: A/B testing, deep dives and regular reporting.

So, just one last second on the important piece, and why I keep coming back to it: all of that requires that we treat this data as a product. If we want multiple people using the things that I own and build, we have to provide them as trustworthy assets, in a way that makes them easy for people to discover and actually work with.
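The reuse pattern Christoph describes, joining a centrally owned metric product with the experiment's assignment metadata and grouping by variant, is simple enough to sketch. The tables and column names below are hypothetical stand-ins, not HelloFresh's actual data model.

```python
# Sketch of an A/B readout built on a shared metric product.
# Because both the experiment readout and the production dashboards consume
# the same metric definition, the two cannot silently drift apart.
import pandas as pd

# Owned by the metrics team and served as a data product:
metrics = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "converted": [1, 0, 1, 1],
})

# Owned by the A/B testing team: the experiment's assignment metadata.
assignments = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "variant": ["control", "control", "treatment", "treatment"],
})

# Join the metric product with the experiment metadata, group by variant.
readout = (
    assignments.merge(metrics, on="customer_id")
    .groupby("variant")["converted"]
    .mean()
)
print(readout)  # conversion rate per variant
```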
Yeah, and coming back to that: to me, this is why I get so excited about data mesh, because I really do think it's the right direction for organizations. When people hear data product, they say, well, what does that mean? But then when you start to define it as you did, it's using data to add value. That could be cutting costs, that could be generating revenue, it could be directly creating a product that you monetize. So it's sort of in the eyes of the beholder. But I think the other point, which you made earlier too, is, again, context. When you have a centralized data team and you have all these P&L managers, a lot of times they'll question the data because they don't own it. If it doesn't agree with their agenda, they'll attack the data. But if they own the data, then they're responsible for defending it, and that is a mindset change that's really important. So I'm curious how you got to that ownership. Was it top-down, with somebody providing leadership? Was it more organic, bottom-up? A combination? How did you decide who owned what? In other words, how did you get the business to take ownership of the data, and what does owning the data actually mean?

That's a very good question, Dave. I think that is one of the pieces where we have a lot of learnings, and if you asked me how we would start this again, that would be the first piece where we'd really think about how it should be approached. If you start with ownership, it means the team has a responsibility to host and serve the data assets to minimum acceptable standards, with minimal dependencies up and downstream. The interesting piece, looking backwards at what happened, is that under that definition, the process we had to go through was not actually transferring ownership from a central team to the distributed teams, but in most cases establishing ownership. I make this distinction because saying we had to transfer ownership would erroneously suggest that the datasets were owned before. The platform team, yes, they had the capability to make changes to the data pipelines, but the analytics teams were always the ones who had the business understanding, knew the use cases and knew what was actually expected.

So we had to go through this very lengthy process of establishing ownership, and in the beginning we did it very naively. We started with a document: here are all the data assets, who is probably the nearest neighbor who can take care of each one? And then we moved them over. But the problem is that all of these things are kind of technical debt, right? Not properly documented, pretty unstable, built in a very inconsistent way over the years, and the people who built them had already left the company. So it was not a nice thing to be handed, and people built up a certain resistance, even if they had actually bought into the idea of domain ownership.

So if you ask me about the learnings: what needs to happen first is that the company really understands its core business concepts. You need the mapping: these are the core business concepts we have, these are the domain teams who own those concepts, and then you actually link that to the assets. And you integrate that with an understanding both of how we evolve the data assets and build new things in each domain, and of how we can address the reduction of technical debt and stabilize what we already have.
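That concept-to-domain-to-asset mapping is easy to picture as a small data structure. A hypothetical sketch, with invented concepts, teams and asset names, just to show the shape of the idea:

```python
# Core business concepts -> owning domain team -> the data assets behind them.
# Every name here is a hypothetical illustration.
CONCEPT_OWNERSHIP = {
    "customer": {
        "owning_domain": "crm",
        "assets": ["customers_dim", "subscription_events"],
    },
    "order": {
        "owning_domain": "fulfillment",
        "assets": ["orders_fact", "delivery_slots"],
    },
    "recipe": {
        "owning_domain": "culinary",
        "assets": ["recipes_dim", "recipe_ratings"],
    },
}

def owner_of(asset: str) -> str:
    """Resolve which domain team is accountable for a given data asset."""
    for concept in CONCEPT_OWNERSHIP.values():
        if asset in concept["assets"]:
            return concept["owning_domain"]
    raise KeyError(f"{asset} has no established owner yet")

print(owner_of("orders_fact"))  # -> fulfillment
```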
Thank you for that, Christoph. So I want to turn direction here and talk about governance, Clemens, and I know that's an area you're passionate about. I pulled this slide from your deck, which I kind of messed up a little bit, sorry for that; by the way, we're going to publish a link to the full video that you guys did, so we'll share that with folks. But governance is one of the most challenging aspects of data mesh. If you're going to decentralize, you quickly realize this could be the Wild West, as we talked about, all over again. So how are you approaching governance? There are a lot of items on this slide that underscore the complexity, whether it's privacy, compliance, et cetera. How did you approach this?

It's about connecting those dots, right? The aim of the data governance program is to promote the autonomy of every team while still ensuring that everybody remains interoperable. So if we want to move from the Wild West, riding horses, to a civilized way of transport, we can take the example of modern street traffic. All participants can maneuver independently, and as long as they follow the same rules and standards, everybody remains compatible with each other and can understand and learn from each other, so we can avoid car crashes. When I go from country to country, I understand what the street infrastructure means, how to drive my car; I can also read the traffic lights and the different signals.

Likewise, the businesses at HelloFresh operate autonomously and consequently need to follow the external and internal rules and standards set forth by the jurisdictions in which we operate. In order to prevent a car crash, we need to at least ensure compliance with regulations, to account for society's and our customers' increasing concern with data protection and privacy. Teaching, advocating and evangelizing this to everyone in the company was a key communication strategy. And of course, I mentioned data privacy as an external factor; the same goes for internal regulations and processes that help our colleagues adapt to this very new environment. When I mentioned before the new way of thinking, the new way of dealing with and managing data, this of course implies that we need new processes and regulations for our colleagues as well.

In a nutshell, this means that data governance provides a framework for managing the people, the processes, the technology and the culture around our data traffic, and those components must come together for an effective program. Providing at least a common denominator is especially critical for shared data, which we have across our different geographies, managed in shared applications on shared infrastructure, and then consumed by centralized processes: master data, for example, and all the metrics and KPIs which are used for central steering. It's a big change, Dave. Our ultimate goal is to have this non-invasive, federated, automated and computational governance. And for that we can't just talk about it; we actually have to go deep, use case by use case and POC by POC, and generate learnings with the different teams. A classical approach: identify the target state, match it with the current state, and do that together with the business teams in the different domains, with a risk assessment, for example, to increase transparency, because a lot of teams might not even know what kind of situation they're in. And this is where the training and the data literacy piece come into place, where we go in and, based on the findings and the most valuable use cases, help our teams make this change and increase their capability a little bit more. I wouldn't say hand-holding, but a lot of guidance.

Clemens, can I chime in quickly? Dave, if you allow me. There's a lot to the governance piece, and I think that is important. If you're talking about documentation, for example, yes, we can go from team to team and tell these people, hey, you have to document your data assets in the data catalog, or you have to establish a data contract, and so on and so forth. But if you would like to build data products at scale, following actual governance, you need to think about automation. We need to think about a lot of things we can learn from engineering. And this starts with simple things: if we would like to build up trust in our data products and actually apply the same rigor and best practices that we know from engineering, there are things we can do, and we should think about what we can copy. One example might be service level agreements, service level objectives and service level indicators. On an engineering level, if you're providing services, SLAs represent the promises we make to our customers and consumers, SLOs are the internal objectives that help us keep those promises, and SLIs are the way we track how we're doing. This is just one example of where federated governance comes into play. In an ideal world, you should not just talk about data as a product but also about data product as code: as much as possible, give the engineers the tools they are familiar with, and don't ask the product managers, for example, to document their data assets in the data catalog, but make it part of the configuration. Have this in a CI/CD continuous delivery pipeline, as we typically see for other engineering tasks and services. There is configuration; we can think about PII, about data quality monitoring, about ingestion into the data catalog, and so on. Ideally, in a data product world, we come up with certain templates that can be deployed and are verified or rejected at build time, before we actually deploy to production.
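Here is a small sketch of what that "data product as code" configuration could look like, with a validation step meant to run inside the CI/CD pipeline Christoph mentions. The keys, check names and SLO values are hypothetical, chosen only to illustrate the pattern.

```python
# Governance-relevant configuration lives next to the pipeline code and is
# validated at build time. All keys and values are hypothetical examples.
PRODUCT_CONFIG = {
    "name": "orders_fact",
    "owner_team": "fulfillment",
    "pii_columns": ["customer_email"],   # drives masking and deletion policies
    "quality_checks": ["no_duplicate_keys", "freshness_under_24h"],
    "register_in_catalog": True,         # auto-ingested, not hand-documented
    "slo": {"freshness_hours": 24, "availability": 0.999},
}

REQUIRED_KEYS = {"name", "owner_team", "pii_columns", "quality_checks", "slo"}

def validate(config: dict) -> None:
    """Reject the deployment at build time if governance metadata is missing."""
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"config incomplete, cannot deploy: {sorted(missing)}")
    if config["slo"]["freshness_hours"] <= 0:
        raise ValueError("freshness SLO must be a positive number of hours")

validate(PRODUCT_CONFIG)  # run as a CI step; an exception fails the build
```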
Yeah, so it's like DevOps for data products. So I'm envisioning almost a three-phase approach to governance, and it sounds like you're in the early phase; call it phase zero, where there's learning, literacy, training, education, a kind of self-governance, some oversight, a lot of manual work. Then you become process builders, then you codify it, and then you can automate it. Is that fair?

Yeah, though I would rather think about automation as early as possible. Yes, there need to be certain rules, but then start, use case by use case: is there a small piece that we can already automate? If that's possible, roll it out and then extend it step by step.

Is there a role, though, that adjudicates that? Is there a central Chief Data Officer who's responsible for making sure people are complying, or how do you handle that?

I mean, from a platform perspective, yes, we have a centralized team to implement certain pieces that we think are important and would like to implement. However, that team works very closely with the governance department, so it's Clemens' piece to understand and define the policies that need to be implemented.

So, Clemens, essentially it's your responsibility to make sure the policies are being followed, and then, as you were saying, Christoph, you try to compress the time to automation as fast as possible. Is that fair?

Yeah. What needs to be really clear is that it's always a split effort, right? You can't just do one thing or the other; everything really goes hand in hand, because for the right automation, for the right engineering and tooling, we need to have the transparency first. I mean, code needs to be coded, so we kind of need to operate on the same level, with the right understanding. So there are actually two things that are important: one is policies and guidelines, but not only that, because equally important, or even more so, is to align with the end users, the tech teams and engineering, and really bridge between business value, the business teams and the engineering teams.

Got it. So, just a couple more questions, because we've got to wrap. I want to talk a little bit about the business outcome; I know it's hard to quantify, and I'll talk about that in a moment. But first, major learnings. We've got some of the challenges that you cited; I'll just put them up here. We don't have to go into detail, I just wanted to share them with folks. My question, and this is the advice-for-your-peers question: if you had to do it differently, if you had a do-over, or a mulligan, as we like to say for you golfers, what would you do differently?

I can start with the transformational challenge: understanding that this also carries a heavy load of cultural change.
I think it's important that a proper communication strategy is put in place and that people are really supported. It's not enough to go in and say, well, we have to change towards a data mesh; it's simply human nature that we're resistant to change, and of course people are uncomfortable. So we need to take that away through training and communication. Christoph, would you want to add something to that?

Definitely. Beyond the points I've made before, we need to acknowledge that data mesh is an architecture of scale. It's something that's needed by huge companies building data products at scale. I mean, Dave, you mentioned that, right? There are a lot of advantages to a centralized team, but at some point it may make sense to decentralize. And at that point, if you think about data mesh, you have to recognize that you're not building on a green field. I think there's a big learning here, which is also reflected on the slide: you carry your baggage with you. Typically you come to a point where the old model doesn't work anymore, and at HelloFresh, right, we lost the trust in our data, and we saw certain risks that we were slowing down our innovation; this was what triggered the need to actually change something. This transition implies that you typically have a lot of technical debt accumulated over years. And I think what we have learned is that we potentially decentralized some assets too early, without actually taking into account the maturity of the teams we were distributing to, and now we're in the phase of correcting pieces of that. So if you start from scratch, you have to understand, okay, are my teams actually ready to take on this new capability? And you have to make sure that with this decentralization you build up these capabilities in the teams. And, as Clemens mentioned, make sure you take the people with you on the journey. These are the pieces that come with this knowledge gap, where we need to think about hiring and literacy, and the technical debt I just talked about.

And the last piece I would add, which is not here on the slide deck: from our perspective, we started on the analytical layer, because this is where things were exploding, where people felt the pain. But through a lot of the efforts we've made to modernize the current state towards data products, towards data mesh, we have understood that it always comes down to a proper shape of our operational plane. We went through a lot of pain, but the learning here is that this really needs to be a commitment from the company, and it needs to happen end to end.

I think that last point you made is so critical, because I hear a lot from the vendor community about how they're going to make analytics better, and that's not unimportant, but true data product thinking and decentralized data organizations really have to operationalize in order to scale. So these decisions around data architecture and organization are fundamental and lasting.
It's not necessarily about an individual project's ROI; there are going to be projects and sub-projects within this architecture, but the architectural decision itself is organizational, it's cultural, and it's about the best approach to support your business at scale. It really speaks to who you are as a company and how you operate, and getting that right, as we've seen in the success of data-driven companies, yields tremendous results. So I'll ask each of you to give us your final thoughts, and then we'll wrap. Maybe Christoph first.

Can I quickly jump in on the piece you just mentioned, the target architecture? People often have this picture in mind: okay, there are different stages, we have sources, an ingestion layer, a storage layer, transformation and presentation layers, and then we basically put a lot of technology on top of that, and that's our target architecture. However, I think what we really need to make sure of is that we have different views on it. We need to understand the capabilities we need for our new goals; how it looks and feels for the different personas, an experience view; and then, finally, that should lead to the target architecture from a technical perspective.

And maybe to give an outlook on what we're planning and how we want to move forward: our strategy is to increase data maturity as a whole across the entire company. It's a framework around the business strategy that breaks down into four pillars. There's people, meaning data culture, data literacy, data organizational structure and so on. There's governance, as Clemens mentioned: compliance, governance, data management. There's technology, and I think we could talk for hours about that one: the data platform, the data science platform. And finally, enablement through data, meaning data quality, data accessibility, applied science and data monetization.

Great, thank you, Christoph. Clemens, why don't you bring us home? Give us your final thoughts.

I can only agree with Christoph. What's important is to understand the maturity level the company, the people, the organization are at, and to really understand what such a change implies for those four pillars, and what needs to be tackled first. And this is not very clear from the very beginning, because it's kind of a green field: you come up with must-wins, with things you really want to do, out of theory and out of different white papers. Only when you really start conducting the first initiatives do you understand where you have to put the dots together and where you're missing out on one of those four pillars: people, process, technology and governance. And then it's an iteration, going step by step, small step by small step, without boiling the ocean, where you're really capable of identifying the gaps and seeing where you can either fill them, or where you have to increase maturity first and train people or improve your tech stack.

You know, HelloFresh is an excellent example of a company that is innovating, and it was not born in Silicon Valley, which I love.
It's a global company. And I've got to ask you guys, it seems like an amazing place to work. Are you guys hiring?

Yes, we definitely are. As mentioned, this was one of those aspects of decentralizing, and we are hiring as an entire company, and specifically for data I think there are a lot of open roles. So yes, please visit our careers page, from data engineering to data product management, and Clemens has a lot of roles he can speak about as well.

Guys, thanks so much for sharing your story with theCUBE audience. You're pioneers, and we look forward to collaborating in the future to track your progress. Really want to thank you for your time.

Thank you very much. Thank you very much, Dave.

And thank you for watching theCUBE's startup showcase, made possible by AWS. This is Dave Vellante. We'll see you next time.