Welcome, everyone, to the webinar "Agile Data Warehousing/Business Intelligence: Addressing the Hard Problems" by Scott Ambler. So without further delay, over to you, Scott.

Okay, thank you very much. Hello, good day everybody, or good evening, I suspect, for most of you. My name is Scott Ambler and I'm in Toronto, Canada. A little bit about me: I'm the consulting methodologist for Ambysoft. I help organizations around the world understand and apply agile and Agile Data techniques, to scale, and to address the hard problems organizations are running into, particularly in the data space. I've done a lot of work over the years in this space, and I'm going to share some ideas with you today, all proven techniques. I will not share any theory; if I do share theory, I will be very clear about it when I do, but I probably won't. I'm the person behind the Agile Data method as well as the Agile Modeling method, and, along with Mark Lines, I'm co-creator of PMI's Disciplined Agile toolkit.

So what do I want to cover today? First, agile data ways of thinking. Then I'll define what I'm talking about: what is this agile DW/BI stuff? Then I'm going to focus on the challenges that we're facing, and the solutions to those challenges; it's not enough to complain about the problems, we actually have to fix them. And then hopefully there will be some time for Q&A. We're also going to do, as I guess everybody else is doing, a breakout session afterwards to discuss questions. And if, as you're typing questions into the chat, we run out of time to answer them, which we more than likely will, I'm happy to write up a blog or something and we'll share that after the conference.

So, agile data ways of thinking: how should we approach agile data work, or agile database work? The heart of the Agile Data method is a collection of philosophies. The first one is to look beyond data. Many data professionals like to say that data is the lifeblood of organizations, and that's true, I truly believe that, but there's more to people than just lifeblood. When you only have lifeblood, you have a crime scene. There's also flesh and a skeleton and skin and hair and other good things, so we need to look beyond data. Data is important, but it's only one of many important issues. The reason this matters is that when you only consider data, you're only looking at part of the picture: you'll locally optimize your designs, your architectures, and your approach, and you'll miss usage, you'll miss security perhaps, and other important aspects. So we really need to look at the full picture, not just data. We need to collaborate closely with our stakeholders, with other IT professionals, with everybody, and we need to work in an evolutionary manner when we do so. We also need to be quality infected; I'm going to talk about data quality a fair bit today, and frankly the data community has fallen down on data quality.
And I think that's in many ways because of the traditional mindset that still pervades the data community. They've missed basic techniques that the agile community takes for granted, things we've been doing for years like CI/CD, automated testing, and refactoring. Basic, fundamental strategies have been missed for the most part by the data community, and it's because of the traditional blinders that they tend to wear. That's one of the reasons why the agile data ways of thinking are important.

So we need to embrace evolution. We don't need to get it right the first time: we do architecture, design, and requirements all the way through the lifecycle, continuously, not up front. I'll talk about that for a bit as well. We need to be enterprise aware: do what's right for the organization, not just what's convenient for our project team. Take a fit-for-purpose approach: every team is different, every organization is different, every person is different. One size does not fit all, so be very careful about adopting prescriptive frameworks that promise you the world, including some of the more flexible ones like Scrum. And frankly, data warehousing teams are in a much different situation than application teams, so we need to be aware of that and act accordingly. And everybody should be agile, not just development teams; we need to apply agility across the organization, including within our data management group.

So let's talk about DW/BI for a bit. Agile DW/BI is the act of providing quality information in a collaborative and evolutionary manner, and I'm highlighting three words there: quality, collaborative, evolutionary. We have to have high-quality data and high-quality information being produced by the data team for our stakeholders. It's interesting: I didn't get to attend many of the talks at the conference this week, but I have to think that many of them, at least the talks that went beyond software development, were talking about data-driven decision making, or at least data-informed decision making, which is a great thing. But there's an assumption that we can deliver the data, the information that people need: the right information at the right time, to the right extent, in the right manner. We need quality, and this, once again, is where the data community tends to fall apart. And we need to be collaborative, because our stakeholders' needs change. I'm sure many of the talks you attended were talking about VUCA and the rate of change and all that good sort of stuff. Yes, absolutely. That means the data folks need to work in a similar manner. We can't go off for months at a time to build things; the time frame needs to be days and hours, not weeks and months and years, just like everybody else. We need to step up.

So what are some of the hard problems? That's what I've been preaching about: what are the challenges that we face in doing this? Because frankly it's easier said than done sometimes. I put this talk together for a data conference a few weeks ago, and when I started listing the hard problems I came up with about 20. So I'm going to focus on what I believe to be the top nine. But if there are other challenges that you face, please put them in the chat and hopefully we'll be able to get to them. And then, like I said, I'm going to write this up for you.
And I'll share that as a blog or article, more than likely. But anyways, let's work through these nine problems one at a time. They're not in any sort of order, other than my mood at the time I put the deck together.

Okay, first problem: up-front architecture. Traditional approaches to data architecture are really the challenge here, the desire to do all this up-front thinking. A couple of points I want to make as part of the solution. First, I highly suggest, and I have no skin in this game by the way, looking at Data Vault 2.0 if you are a data warehousing professional, if you are involved with building a data warehouse. The group of people behind it, particularly Dan Linstedt but certainly others as well, are practitioners. They have been out in the field, working in very difficult situations. Bill Inmon, by the way, one of the fathers of data warehousing, is a big fan of DV2 and has been actively involved with this community. And they've done a lot of architecture work for you, for at-scale situations where you're dealing with incredible volumes of data, really, really big data, at scale and fast. Just an incredible amount of thinking has gone into this methodology and into the architecture and design patterns of Data Vault 2.0. You ignore those patterns at your peril. I've seen some organizations where they think they're special, they think they can rethink some of the patterns, and it always goes poorly. One of the reasons I like Data Vault 2.0 is that it's all really solidly thought out, and the agile stuff they're doing is really, really good. Just do what you're told. To be clear, I am not a fan at all of prescriptive methods; I think they're almost always a bad idea. Data Vault 2.0 is pretty much the only thing I've found where I would follow the advice to the letter, and believe me, that is completely unusual for me. But in the case of Data Vault 2.0, do what you're told, really.

When I'm involved with teams that are doing data warehousing, one of the things I want to do right away is what I call a data source diagram, or a Milky Way diagram. I want to identify all the potential data sources that I'm probably going to be interacting with, and some basic data about them. I don't need details, but I do need to know, hey, the NC-1701 database is something that we're probably going to have to work with at some point in time; here's the type of data in it; here's who I talk to, the group I need to collaborate with to get access and to work with their data; and here's how we're going to do it, it's this type of data feed, that sort of thing. I don't need a lot of details, but I do need an overview; I need to understand the landscape of what I'm dealing with. I also want to start doing some conceptual modeling. I want to identify the high-level entity types and where they come from, and I want to start mapping: where am I getting person data from, and student data from, and so on. I don't need any details; I don't need to know that there are 50 data elements behind Student. Those are details I can get later. If I have the capability to identify the details now, I certainly have the ability to do so three months from now when I actually need the data. There's no rush for that. To make this concrete, here's a rough sketch of the kind of lightweight inventory I'm talking about.
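A minimal sketch in Python; everything here except the NC-1701 database mentioned above, including the second source, the contacts, the feed types, and the entity names, is a hypothetical placeholder.

```python
# A lightweight landscape inventory: which sources exist, what's in them,
# who owns them, and how we'd get at the data. No attribute-level detail.
data_sources = {
    "NC-1701": {
        "data": "customer and order history",
        "contact": "operations DBA group",   # who to collaborate with
        "feed": "nightly batch extract",     # how we'll receive the data
    },
    "RegistrarDB": {
        "data": "person and student records",
        "contact": "registrar's office IT team",
        "feed": "REST API, pulled on demand",
    },
}

# High-level conceptual mapping: entity type -> candidate sources.
# The 50 data elements behind Student can wait until we need them.
entity_sources = {
    "Person": ["RegistrarDB", "NC-1701"],
    "Student": ["RegistrarDB"],
}

for entity, sources in entity_sources.items():
    print(f"{entity}: sourced from {', '.join(sources)}")
```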
The point is, I need to understand the landscape: a high-level overview to identify what I'm dealing with. I need the physical overview, which is what the data source diagram is all about, and I need a logical overview as well. That way I can start getting into what we're doing, and the details will come out over time as we implement vertical slices.

So my models need to be just barely good enough. Now, many traditional people will thrash on this concept, but the concept is solid: just barely good enough is the most effective place you can be. Because if something is more than good enough, if you keep investing once something has gotten to the point where it's good enough, you're wasting your time and throwing money away. It just needs to be good enough for the situation you face. The challenge, of course, is how do you know when it's just good enough? This is situational; there are no hard and fast rules, which also blows the minds of the traditional folks. If you just want to be told what to do, the modeling world doesn't work that way.

So what are the factors you need to be concerned about? Well, the things that motivate me to do more modeling, to invest more in an artifact: if I'm in a high-risk situation; if I'm dealing with complexity; or if I'm being pushed by some sort of desire for predictability, say I'm dealing with executives who still believe in on time and on budget and all this nonsense of the management world. The reality is you've got to deal with that, and sometimes you have to increase the risk and waste money in order to make it look as if you're working in a predictable manner. But unfortunately, as we all know, our stakeholders change their minds and the situation changes, so any sort of predictability around budget and time is delusional at best. Don't get me going on that topic.

What motivates me to do less modeling? If I'm working with highly skilled people, people who can gather the details later on, then I don't need to do as much modeling. If I have good access to my stakeholders, the people who can answer the questions. If it's easy to change, if I know how to refactor my databases and it's easy to change whatever it is that I'm modeling, then there's less need for me to think it through up front. This is one of the reasons traditionalists think they need to model everything to the nth degree at the beginning of a project: they don't know how to evolve a database effectively. They believe it's hard, and that's completely false, which I'll talk about later. And if there's great uncertainty, I want to do less modeling, because it's going to change; I'd be absolutely foolish to invest time modeling something that I know is going to change. It's just a sucker's bet. And if I have a very collaborative way of working, I also need to do less modeling.

So this slide is probably of interest to you. I should have also mentioned that these slides are available: you can download them off the Agile India site, and they're also available on SlideShare.net. So you don't need to madly write this stuff down.
How do we deliver value every sprint, so every week or every two weeks? You do that via vertical slicing. Now, vertical slicing is one of the standard techniques in the agile world, but it's a fairly new concept in the data world. Vertical slicing can be difficult without other techniques in the agile database technique stack, such as CI/CD and automated testing and all that good sort of stuff, things I'm going to talk about in a bit. But the idea is that we deliver a little bit of value every sprint, every two weeks. Now, at the very beginning of a DW project, it might be pretty slim: you might only be able to get one data element, or a few data elements, in the very first sprint, because there's a heck of a lot of setup work to get data out of one data source into your warehouse, to do the ETL stuff if you've still got to do that, and then finally into a data mart or whatever it is that you're doing. Just being able to get a few elements in the first two weeks might be hard pressed. But if I'm several months into the effort, you darn well better believe I should be able to deliver full reports and full reporting views and whatever it is you're doing to get the information into the hands of your stakeholders. I should be able to deliver true value every two weeks, and we'll talk about how to do that in a bit.

So how do we vertically slice the work for data warehouses? The answer is not user stories. Or it sort of is: what you really need is a specialization of user stories, something I call question stories, because the goal of the data warehouse is to help people answer questions, to make data-informed decisions. That means we are answering questions, we're helping people answer the questions they have, to address their real needs, and our requirements artifacts should reflect that. A question story is very similar to a user story; it's a specialization: as a role, I want to know something, almost always by a certain timeframe, because of a reason. So very similar to a user story, but nuanced. The nature of data warehousing is different from the nature of app development, so you need to change the way you approach requirements in this case. There's a fair bit of a write-up on the Agile Data site about this technique, and here's a rough sketch of what a question story looks like as a structured record.
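A minimal sketch in Python, using the template just described; the example instance, the enrollment manager and her question, is hypothetical.

```python
# A question story: a specialization of a user story for DW/BI work.
# As a <role> I want to know <question> by <timeframe> because <reason>.
from dataclasses import dataclass

@dataclass
class QuestionStory:
    role: str       # who needs the answer
    question: str   # what they want to know
    timeframe: str  # by when the answer is needed (almost always present)
    reason: str     # the decision the answer supports

story = QuestionStory(
    role="enrollment manager",
    question="which courses are trending toward over-enrollment",
    timeframe="every Monday by 9 am",
    reason="so I can open new sections before registration closes",
)
print(f"As an {story.role} I want to know {story.question} "
      f"{story.timeframe} because {story.reason}.")
```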
Another challenge that we face, and this can be hard: we need to put the right infrastructure in place for our data warehousing. If the data warehouse has been around for a few years, then the infrastructure is all in place, don't worry about it. But if we're at the beginning of our data warehousing journey, working toward the first release, then we might have a fair bit of up-front work to do. The first thing is to embrace evolution; this is part of the mindset. Recognize that we don't need to do everything up front. When I'm coaching data warehousing teams, most of my time is spent yanking the traditional folks back from the precipice and saying, no, no, we don't need to do that, just stop it. We don't need all those details, we don't need them now, and they're going to change anyways. Accept the fact that whatever you deliver is going to evolve, so make it easy to evolve and act accordingly.

What I tell people is we need to think about the future, which is why I do some up-front modeling, as you saw earlier, but we need to wait to act. We fill in the details later, when we need them; we implement things when we need them. Doing it all up front is not the way to go. Now, having said that, sometimes the first release into production is a bit bigger than you'd normally prefer. Earlier I was talking about delivering vertical slices, and how the first few vertical slices might be pretty slim, a few data elements from a single data source. Yes, you can do that, but who cares? One or two data elements, whoop-de-freaking-do. It's not enough, right? So we might actually have to deliver many vertical slices; we might have to do three months of work for the first release, because we've got some basic infrastructure work to do, and that just takes time: we need to do the work to get the data elements out of multiple sources, combine them, clean them up, and put them in our data marts, for example. There's a bit of work to be done until you have enough value that's worth releasing. Now, having said that, you can do a lot less work than the traditional people think, but you still have to do some work, more work than the agile people would probably prefer. So you might have to do three or four months' worth of solid work until you've got enough value up and running that it's worth releasing into production. But as soon as you get to that point, get it out the door; start helping people make better decisions with the data you can provide them as soon as possible, because you need the real-world feedback about what you're actually delivering, and you need to act on that feedback and then deliver more vertical slices on a regular basis.

There's also another serious problem: a lack of people, unfortunately. This is a huge challenge. I've had lots of interesting conversations with organizations that are actively trying to do agile data warehousing, and they can't find coaches. They can't find people experienced in, like I said, some of the basic techniques that all the agile people take for granted, like automated regression testing of databases, database refactoring, and many other techniques. So what do we do? Well, the Agile Data method calls out several roles, all of which are important. Some of them are almost easy to find: there are some good data scientists out there, and there are some good data analysts, although many of them struggle to be agile, but at least they've got the fundamental skills they need, and then we can maybe pair them up with agile people. There are not as many agile data engineers as we'd like, due to the lack of agile data practice experience in many cases; they're still traditionalists, and so on. So the good news is the roles and responsibilities are defined, but we still need to fill them.
Now, one of the challenges, and one of the things I don't like about what I've done here with the Agile Data roles, is that they are specialists, and I'm not a big fan of specialists. So what we need to do is take the specialists that we have and help them become generalizing specialists. What I mean by that is help them become more T-skilled, or really cross-skilled: not just one specialty plus general knowledge, but one or more specialties plus general knowledge of what we're doing. That helps these specialists become more effective. The challenge with specialists is that even though they've got deep skills and can add value, they tend to do their job even when it doesn't need to be done, so they do more work than they need to, and they also repeat the work of other specialists. When you have a wider range of skills, you have a better sense of when to apply a certain technique and when not to, and the ability to at least be involved in it. At the other end of the spectrum are generalists: they have very broad but shallow knowledge, so they can perhaps lead and manage, but they often can't do. What we really need is the best of both worlds, which is what a generalizing specialist is: have a broad knowledge of what's going on, so you can interact and collaborate well with others, but also have one or more specialties so you can actually be useful and do something of value. We really need something in between. I'm probably preaching to the choir here; this is a very common concept in the agile community, not as much in the data community, so just be aware of that. And of course invest in your people: get them training, get them coaching, and particularly get them doing non-solo work, pairing up, working in triads (groups of three), or even mobbing. When you do collaborative, non-solo work, that's when you learn, that's when you pick up skills from other people and share your own skills. Absolutely critical to your success.

Another problem, and this is a bit of heresy for the Scrum folks among us: sprints are too freaking slow. Let me go through a scenario to explain why I'm saying this, to be fair. Say we're at work tomorrow, or on Monday morning, and somebody comes to us and says, hey, I need a new report, or I need something changed in this report, or I need a new piece of data. You're working on a reasonably semi-mature data warehouse at least, and you think: yes, three or four hours of work to get you access to that new element, or to write that new report, test it, and release it. Three to four hours of work, no problem. So what I'm going to do is write up a question story for it and put it on our backlog, and because it's so high priority, and because we really like you, we'll do it next sprint, because we're already one week into our existing two-week sprint. So we'll do it next sprint and release it to production. So here we are on Monday, and I'm telling the person I'll get three to four hours of work delivered three weeks from now. That's crazy, right? Can you imagine being the business person hearing that nonsense?
And even if my answer was, I'll get that three to four hours of work done, I'll fit it into this sprint somehow and we'll release it on Friday, it's still a week to get a couple of hours of work done. That also doesn't sound great. So we can do a lot better.

My advice here is this: the first release of a data warehouse is probably going to be organized as an agile project, and working in sprints to get that first release out, that first big chunk of work, makes a lot of sense. You do have some infrastructure to put in place, and you do have to do a bit of work to get enough value to make it worth your while to release, and all that sort of stuff. So you'll work in a Scrum-based agile manner, more than likely as a project, because most of our organizations are still doing projects, even though they probably shouldn't be. But once we get that first release out, I very, very quickly want to pivot my team to more of a lean, continuous delivery approach, following a Kanban-based approach where we pull a single piece of work into the team, one at a time, do it, and move on. So we have continuous delivery. Then, if I do have this high-priority, three-hour chunk of work coming in on Monday morning, maybe we could actually get it done on Monday and released on Monday afternoon or Monday evening, because it's really high priority, a big vice president or something, and we've got to get this work done.

So there are two major changes here. First, we're moving away from a project-based paradigm to a product, or continuous delivery, paradigm. And second, we're moving away from an agile approach, almost always based on Scrum, to a leaner approach, almost always based on Kanban. This allows us to serve our stakeholders well. Now, having said that, it implies we have the skills and the environment to work in this more advanced manner, so it might take you a while to evolve away from a Scrum-based project approach into a lean continuous delivery approach. This is one of the challenges we face as coaches, and frankly this is a general issue; for the agile people on the call, this is probably not news to you. We've been doing this for years with our other teams, and it's the same thing with your data warehousing team. Of course, we also need to be realistic sometimes: just because something is high priority for this one end user doesn't mean it's high priority for everybody else. So yes, you desperately want your little report done, but you know what, I've got a lot of other far more important things to do first, and it's going to be a few weeks until we get around to it; it's just not as high a priority as you think.

Another strategy, and this is almost always part of the evolution to a lean continuous delivery approach, is to shorten your sprints. Don't lengthen them, never lengthen them. Teams will always have excuses for why they need longer sprints: they're special, everything they do is different, the rules don't apply to them. That's just ignorant nonsense, and frankly, it shows a lack of coaching and a lack of training in most cases.
So yeah, if your sprints are longer than two weeks, you've got a serious problem, and frankly I would be looking to shorten them to at least a week. Actually, I've seen a lot of data warehousing teams get to the point where they're still working in sprints, but it's a one-week sprint, and that's not so bad. Telling people, we'll throw it in with what we release at the end of the week, is usually pretty acceptable in most cases. So a one-week sprint is something I've seen a lot of teams live with. But like I said, I would prefer a lean continuous delivery approach where we're releasing several times a day: it's ready to go, release it. Because I want to support my stakeholders as best I can.

Now here's a big problem, and probably several of you, the naysayers among us, and I love those people, the skeptics, are thinking: wait a minute, this is a bunch of nonsense; the reason our sprints are so long is that it takes more than a sprint, more than two weeks, to do the data analysis. Okay, fair enough. Welcome to reality, that's the way it is: sometimes you've got some very complex data on the back end, or you've got multiple sources, or stuff you don't understand, or you're trying to answer really hard questions. I worked on one team a few years ago where, on average, we needed to do six to eight weeks of data analysis in order to understand the requirement, to understand the data sources, to answer the questions we were being asked. Six to eight weeks on average: sometimes a little less, sometimes a little more, but generally six to eight weeks. I worked on another team where it was also pretty much always six weeks; if it was taking more than six weeks, around week four they would do a push to get it done in six weeks, but anyways, call it six weeks. And that was just because the nature of what they were doing was phenomenally complex, and it took some very smart people, almost all of whom had PhDs, six weeks to figure it out.

The implication is that you need to do look-ahead data analysis. Sometimes people call this refinement or grooming; not really, but use whatever terms you want, it's really look-ahead data analysis. I've got to have one or more people with data analysis skills working ahead of the rest of the team, doing the data analysis for the question stories that we intend to implement in an upcoming sprint. In this example, even though we're at the beginning of sprint six right now, we believe we've got to implement three question stories in sprint nine, almost two months from now. For question story B, we need to start the look-ahead analysis now. The other two stories are a little easier, so they require less analysis, and we can start them in sprint seven and sprint eight. But we need to invest the time to do the analysis. The challenge with this, of course, is that there's a lot of planning overhead, and things change. And it's not just about sprint nine, because we have the same problem for whatever we're going to implement in sprint eight and sprint seven and sprint six and so on. So our analysts tend to become a bottleneck really quickly as a result. The sketch below shows the simple arithmetic behind this kind of look-ahead scheduling.
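A small sketch in Python of that scheduling arithmetic; the per-story analysis estimates are hypothetical, chosen to match the sprint-six/sprint-nine example above.

```python
# Given the sprint a question story is targeted for and how many sprints
# of look-ahead analysis it needs, work out when analysis must start.
current_sprint = 6
stories = {
    # story: (target sprint, sprints of analysis needed)
    "A": (9, 1),
    "B": (9, 3),
    "C": (9, 2),
}

for name, (target, analysis_sprints) in sorted(stories.items()):
    start = target - analysis_sprints
    note = "  <- start now!" if start <= current_sprint else ""
    print(f"story {name}: begin analysis in sprint {start}{note}")
# story A: begin analysis in sprint 8
# story B: begin analysis in sprint 6  <- start now!
# story C: begin analysis in sprint 7
```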
That's another reason why you want to avoid having data analysis experts, or specialists, on your team: you really want data analysis skills across all team members, so that we can even out the work a bit better. Now, the management overhead goes away in the continuous delivery world. You still need to do the look-ahead analysis, you're not going to get away from that. But at least now, when we're pulling chunks of work into our team one at a time, I can shift my mindset and say, wait a minute, the analysis is just part of the overall development work. If I pull the work into the team, that's when I kick it off, and if the analysis is the first thing I do, which it almost always is, then it's just part of the overall development work. So if it takes four or five weeks of data analysis and then a couple of days of implementation work, okay, fine, it is what it is. The management overhead goes away, which is one of the big parts of the religious Scrum-versus-Kanban debate, and it's absolutely true. From a management point of view, the lean continuous delivery approach works a lot better; it squeezes the bureaucracy out just by the nature of the way you're working.

Another common problem, and we're almost through all the challenges: my product owner doesn't understand the data. Well, okay, fair enough, the data is pretty complicated. So what do we do? First, my product owner needs to collaborate closely with the other data professionals; the product owner might have to work very closely with the data scientists and the data analysts, and by doing that they start picking up the skills. Once again, we need generalizing specialists: a specialized product owner is probably not going to get the job done. They need to understand the domain they're working in, and getting that understanding is going to take time in many cases. Fair enough.

Second, have active stakeholder participation. This is a technique from the Agile Modeling method; it actually comes from Extreme Programming, which has a practice called On-Site Customer. Agile Modeling took it one step further and said: not only should you have your customers right there with you to answer questions right away, they know the business far better than you do, so put them to work. Get them actively involved in the modeling, get them actively involved in the data analysis, and have them explore their own data. That might require a little bit of coaching and training as well, but it also requires you to adopt simple techniques like post-it notes and sketches on whiteboards and stuff like that, and some of the low-code/no-code development tools as well. So there's a lot of opportunity for active stakeholder participation, particularly if you want to do self-serve BI, self-serve business intelligence. The only way you're going to pull that off is if you involve your stakeholders and coach them and train them in using these really great tools, which are also development tools in many cases.
And of course you need to train and coach them and, like I said, help them move towards being generalizing specialists: constantly learning, constantly getting better, constantly collaborating with others.

We also have another problem: too much data technical debt. Here's a picture of fish swimming in the ocean, and they're swimming amongst garbage. This is the reality in most organizations: we have significant technical debt, and we have significant data technical debt as well. This is a bit of a blind spot for the agile community, frankly; we always talk about code technical debt and architecture technical debt, but we don't have as many conversations about data technical debt. And frankly, data technical debt is a bigger problem to solve, and a nasty problem in many cases. So what's data technical debt? Well, technical debt is just poor-quality stuff; data technical debt is poor-quality data. It slows us down and it's expensive. It's absolutely devastating for all these techniques and strategies where we talk about making data-informed decisions. A lot of the stuff around value streams and product management assumes we're going to make better decisions based on our incoming data; well, if our incoming data is poor quality, we're going to make poor-quality decisions, because we have poor-quality information. Garbage in, garbage out. If we're doing artificial intelligence, it's a showstopper; once again, garbage in, garbage out. So we need to be quality infected, we need to stop tolerating poor quality, and we need to fix the data at the source.

This is what database refactoring is all about. I co-wrote a book, about 15 years ago now, called Refactoring Databases, with Pramod Sadalage, who's also been a speaker at this conference over the years. The idea with database refactoring is that, just like code refactoring, where we safely change the code in small steps, we do the same thing with our back-end data sources, including in production. This book was written from the point of view of a database being accessed by 100 different systems, running on 100 different platforms, owned by 100 different teams, none of which are under my control. There's nothing special about the number 100: it could be 1,000, it could be 10,000, it could be a million. It's just that a large number of systems are highly coupled to my database schema. And even then, think of the customer database in your production environment, your most important database: it's still completely and utterly trivial to evolve a production database schema if you know what you're doing.

So here's a quick example. Say I want to rename a column, in this case FName, in the Customer table in my production database. The problem is that if I just rename it, it's going to blow up 100 systems, because I've got at least 100 pieces of hard-coded SQL accessing that data. And it's not just about an encapsulation layer; the programmers among us will say, well, we just need to use Hibernate. Yeah, that improves things, but it's not the real solution, and it won't work across the board; but anyways, I'm happy to talk about that later. So what we do is introduce the column we want, FirstName, copy the data over, and put a trigger in place to synchronize the data, because some of my systems will be accessing the old column and some will be accessing the new one. Here's a rough sketch of those first steps.
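A minimal sketch of these steps, using SQLite via Python's standard sqlite3 module. The table and column names follow the example from the talk; the single trigger shown is illustrative rather than production-grade, since a real refactoring would also synchronize inserts and the reverse direction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Existing schema: 100+ systems have hard-coded SQL against FName.
cur.execute("CREATE TABLE Customer (id INTEGER PRIMARY KEY, FName TEXT)")
cur.execute("INSERT INTO Customer (FName) VALUES ('Ada'), ('Grace')")

# Step 1: introduce the new column alongside the old one.
cur.execute("ALTER TABLE Customer ADD COLUMN FirstName TEXT")

# Step 2: copy the existing data over.
cur.execute("UPDATE Customer SET FirstName = FName")

# Step 3: a trigger to keep the columns synchronized while old systems
# still write to FName.
cur.execute("""
    CREATE TRIGGER Customer_SyncFirstName
    AFTER UPDATE OF FName ON Customer
    FOR EACH ROW
    BEGIN
        UPDATE Customer SET FirstName = NEW.FName WHERE id = NEW.id;
    END
""")

# An old system updates the old column; the trigger keeps the new one in sync.
cur.execute("UPDATE Customer SET FName = 'Adele' WHERE id = 1")
print(cur.execute("SELECT id, FName, FirstName FROM Customer").fetchall())
# [(1, 'Adele', 'Adele'), (2, 'Grace', 'Grace')]

# Much later, once every system uses FirstName (months or even years):
#   DROP TRIGGER Customer_SyncFirstName;
#   ALTER TABLE Customer DROP COLUMN FName;  -- supported in SQLite 3.35+
```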
We then announce the refactoring and give the teams owning the other systems enough time, which could be months or even years, to start using the FirstName column. Then eventually we delete the old column and the trigger, and the refactoring is complete. We need this deprecation period, this interim schema, because we can't break the 100 systems that are accessing the data. So it is possible. It's a deep topic that I've spent one minute overviewing; read the book. There are about 65 refactorings, and we include full source code. There are tools to do this now, don't get me wrong, but if you don't have access to tools for some strange reason, as long as you've got the ability to type in code from a book, you can in fact do this yourself. This is real. Anybody who tells you it's not possible, that it can't be done, that they're special: they're not special. It can be done, and the grownups are doing it. So it is possible. Trust me. Well, don't trust me, look into it.

Finally, our stakeholders don't want to work with us. Fair enough, they're busy people, but we've also burned our bridges in many cases. A big part of the overall solution, to help motivate our stakeholders to work with us, is to start delivering value more frequently. Move away from big releases that are highly risky, that are almost always problematic, that almost always don't deliver what people want, and instead deliver frequent value on a regular basis: every two weeks, every week, every day, deliver more value. When you start doing that, you'll find your stakeholders are willing to work with you. Because the story of "tell me all your requirements up front and maybe we'll deliver something six to nine months from now" is not attractive. Tell me what you want today and I'll get it out by the end of the week, or the end of the day? Yeah, okay, I'm happy to work with people who can do that. So vertically slice, work on question stories, have active stakeholder participation, and be interesting to work with.

Just to wrap up: the increasing pace of change, the complexity you're dealing with, and the increasing volume of data demand nothing less than complete data agility. You have no choice; you need to work this way, otherwise you're out of luck. It's really that simple, and the rules apply to the data folks too. That's the situation we're now in. So thank you very much; I think I'm at time.

Thank you, Scott. Okay, thank you.