Hello and welcome, my name is Shannon Kemp and I'm the Chief Digital Manager at DATAVERSITY. We'd like to thank you for joining this month's webinar, Lean Data Modeling for Any Methodology. It's the latest in the monthly webinar series sponsored by IDERA. Just a couple of points to get us started. Due to the large number of people attending these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A in the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share our highlights or questions via Twitter using hashtag dataversity. And as always, we will send a follow-up email within two business days containing links to the slides, the recording of this session, and additional information requested throughout the webinar.

Now let me introduce our speaker for today, Ron Huizenga. Ron is a Senior Product Manager of Enterprise Architecture and Modeling at IDERA. Ron has over 30 years of business and IT experience as an executive and consultant, spanning a diverse range of industries. His hands-on experience in large-scale enterprise initiatives includes enterprise and data architecture, business transformation, and software development. His background brings practical, real-world insights to enterprise data architecture, business architecture, and governance initiatives. And with that, I will give the floor to Ron to get today's webinar started.

Thank you, Shannon, and welcome, everybody. It's wonderful to have you today. What I'm going to do today is take a bit of a different take on data modeling in particular, but in the first part I'm really going to talk about methodology. We've seen many changes in software development methodologies over the years, and quite often we see that the role of the data modeler has changed; in many organizations the role of the data modeler has actually diminished, and that is a very bad thing because the data is so important to our organizations. So what I'm going to do is set the stage for how we actually got here. I'm going to start with a brief history lesson, because it's very interesting when you look at the way some of these methodologies or methods spring up. They'll often have some very dedicated advocates, and immediately people will say, we need to do it this way because the people who did it the old way didn't know what they were talking about. That's the type of thing I really want to try to dismiss here today. What I really want us to look at is an evolving body of knowledge that has roots that are deeper and longer lasting than you may realize. So hopefully you'll get a few nuggets out of this today to help you see not only why we're doing things the way we are, but where this came from and how these methods have been adapted to what we're doing now in terms of software development, data, and those types of practices. So like I said, I'm going to start with a brief history lesson on where things came from. I'm going to contrast some of the methodologies, just to position them and some of the differences between them. Obviously, one of the things that always comes into play is the biological unit, the human factor. I want to talk about that a little bit, in particular in terms of how teams are organized.
And we also want to talk very briefly about data modeling's increasing value and why we really need to make sure that data modeling is part of all of these projects we're doing. I'm going to back that up with a case study: one that started without data modeling incorporated into it, where it went, what was done to remediate that, and what the impact of the data modeling in particular was on that case study. Then I'm going to come back and talk about some of the lean principles, because the title of this webinar really is about applying lean principles, or lean data modeling, for any methodology. So I'm going to backtrack and talk about how the lean principles I'll have introduced in other areas really apply here as well. And then we'll wrap it up and have some Q&A after the fact.

So back into history. You probably never thought you'd see a slide like this, where I'm talking about the Toyota Production System in a manufacturing environment. But I really want to ground folks here, because we can start off this entire session with the assumption that there's absolutely no such thing as an original idea in systems development. Virtually everything we do, and all the methodologies that have sprung up, have been adapted from other areas: basically from the industrial revolution, primarily manufacturing and those methods in one form or another. When we look at this, we can look at things like the Toyota Production System. That was really driven off work by W. Edwards Deming, if you've heard of him. It was really around statistical quality control and those types of things: how to improve manufacturing operations and organizations. And even that has its roots back in something called The Principles of Scientific Management, published in 1911 by Frederick W. Taylor. What really happened was that Deming picked up that body of knowledge and really brought it to fruition. The work that he did was a major factor in Japan's industrial rebirth post-World War II. And that was the springboard that led to a lot of different things: the total quality management movement of the 1980s and 1990s, and Six Sigma being introduced, again through manufacturing venues, in 1986 by Bill Smith (it's often credited to Motorola, but Bill Smith worked for Motorola at the time). That ultimately led to a higher focus on quality with things like the Malcolm Baldrige Quality Awards, initiated in 1988, and onward. That was really driving improvements in manufacturing, industrial processes, and those types of things. And we saw that as a springboard for things like just-in-time manufacturing; even though that started in the 80s, it really gained popularity in the 90s.

So you're asking yourself, OK, where's this guy going, talking about manufacturing? Well, all these principles have carried across into the way that we've developed systems over the years as well. If we start with that same timeline back in the 1950s, at the same time that TPS was going on at Toyota, that's when we first saw concepts called structured programming, in the 1950s. That gave way to advancing the methodology to what we now affectionately refer to as the waterfall methodologies, which were really grounded in the 1960s.
People think that waterfall lasted a lot longer than that, basically flowing all the way through until some of the agile methodologies came along in the 1990s, but that's not the case. We saw things like iterative and incremental methodologies starting, and being put into practice, as early as the 1970s. The 1980s brought about things like prototyping, spiral, and those types of approaches. In those first four blocks, the first four decades, the focus was really on predictive methodologies: trying to turn development into a science so you could predict the outcomes and then apply that elsewhere. What we really started to see happen after that is a move into more of what we would call adaptive methodologies. That's where we started bringing in things like rapid application development, sometimes called RAD, in the 1990s. Then we saw the birth of Scrum in '95 and Extreme Programming in '97. The Agile Manifesto actually wasn't published until 2001, which is what a lot of people refer back to; those agile methods then started to gain traction in the 2000s, largely due to the publishing of the Agile Manifesto. But even those methodologies have continued to evolve from there. And all of these methodologies, especially the predictive ones, have their roots in those same principles of measurement, predictive metrics, and those types of things that came out of the manufacturing concepts we talked about above.

So let's talk about a few of them and define them. Waterfall is really a linear and sequential approach to the software development life cycle, used in software engineering and product development. Most of us are familiar with it, and we realize that it represents a logical progression of steps: from requirements through analysis, design, development, testing, deployment, and then obviously maintaining the solution after the fact, to paint a very simple picture here.

In contrast, the larger grouping of items called Agile is also aimed at software development, but based on iterative development rather than that sequence we were talking about earlier. We characterize it as requirements and solutions that keep evolving through the collaboration of the teams. The idea of self-organizing, cross-functional teams also came into play. What they were really trying to accomplish was to increase productivity and reduce the time to realizing benefits, relative to what the waterfall methodologies were delivering. There are variants of this: there's Scrum, which came out as we talked about, and there's also Extreme Programming, which we'll talk about. We can't cover all of them and all the people who contributed to them, but we're going to talk about those two in particular just because they're fairly popular and most people are somewhat familiar with them.

So when we talk about Scrum, obviously the name and the Scrum meetings are drawn from the rugby scrum or huddle. Scrum really is a lightweight process framework for agile software development. The book written by Ken Schwaber and Mike Beedle was a very popular read, and it really lays out a number of different points. What they talk about there is reintroducing flexibility, adaptability, and productivity into systems development. There are some key characteristics.
In the book, they talk about fixed-duration iterations called sprints, which they set at 30 days. In practice, we see varying amounts of time dedicated to these sprints or iterations. It's driven by the concept of a prioritized product backlog, which turns into a sprint backlog; we'll talk very briefly in an upcoming slide about how that breaks down. It also embraces the idea of self-organizing teams, but there are a couple of very critical roles that come into play in Scrum in particular. There's the product owner, who is the keeper of the requirements, owns the requirements, and prioritizes them for when they're picked up and actually put into the Scrum process. And there's also the Scrum master, who is really like the traffic cop, making sure that the process is adhered to by all the team members. There's also the idea of daily Scrum meetings: a very brief daily checkpoint, making sure that everything is going well, eliminating blockers, and all those types of things. And each sprint is characterized by things like a sprint kickoff and sprint planning session, as well as retrospectives, in terms of lessons learned, at the end. So you're continually building that body of knowledge and evolving the way you're working as you progress through the project itself. All good things, and all things that have really helped the adaptability of software methodologies in delivering systems.

Extreme Programming is the most specific, and some people would regard it as the most radical, of the agile development frameworks, because it does have some interesting differences. They start with five basic values, one of which is communication. They really emphasize face-to-face communication; when it came out, that meant at the whiteboard. That is becoming more difficult, of course, with distributed teams and that type of thing, but having face-to-face discussion and really clarifying things is still very important. And if not face-to-face, have the discussion through whatever means are actually available to you through the technology. Simplicity is a focus: what's the simplest thing that will work to solve the problem, rather than over-engineering and over-building what you're trying to implement? And it's based on constant feedback: you're building, you're getting feedback, and you're continually adjusting to make sure you're getting toward the product that you want to deliver. They also have a value called courage, which basically is effective action in the face of fear. What that really means is they want everybody to feel empowered. When Kent Beck wrote his book about this, he talked about it in terms of showing a preference for action based on the other principles, so that the results are not harmful for the team. In other words, you need to have the courage in those teams to raise organizational issues that are reducing the team's effectiveness, and to be able to bring them up without fear of repercussion. That's really what we're talking about here. And in conjunction with that is respect: a respectful means of collaboration and communication that extends to all aspects of what you're doing in the team as well.
In terms of practices, it's quite often driven by user stories, so the requirements take on the form of user stories. Pair programming is used in almost all XP projects that I've seen. Some folks have gotten away from it, but it is actually one of the fundamental tenets touted in XP. And they really focus on small releases, simple design, and continual refactoring. One of the things that's very important in XP is the concept of continuous integration: you're making changes, you're checking them in, and you're making sure the entire build works and is good before it goes on. This has also been kind of an anti-establishment revolt, I guess you could say, in terms of what was happening in software development. A lot of the things that you see here, I think, came about because software developers were getting tired of being beat up, working 24/7, and those types of things. So they really wanted to improve the quality of life for the entire team as well. It instituted things like 40-hour work week maximums for the teams, to make sure they actually have a quality of life, which in turn enhances their productivity on a day-to-day basis. So kind of a good organizational focus there.

To contrast waterfall versus agile very briefly again: waterfall is really that phased approach of requirements, design, coding, integration, and those types of things. When people think of it, it's almost like one activity happens and you pass the baton for the next activity to commence. When we look at the agile methodologies, a lot of these things are going on simultaneously in smaller cycles. Some people think of it as throwing everything into a blender and then breaking it up into multiple sprints, with a little bit of each of these in every sprint as we progress to the actual delivery of the solution, and with a heavy reliance on teamwork. Whereas waterfall runs the risk of sometimes doing a handoff without communication, the real emphasis in the agile or adaptive methodologies is on communication among the teams.

Now, in particular, I'll use Scrum as an example of the agile types of cycles. Again, a very simplistic and practical cycle. You have the inputs, but basically what you're working from is a product backlog that is owned and prioritized by the product owner. You have the sprint planning meetings, and in those meetings the team commits to as much as they can tackle within that sprint. It's typically a one-to-four-week sprint: one week can be a little short, 30 days like I said is what they talked about in the Scrum book, and I've seen two weeks used very commonly. So you go through sprint planning, the team commits to what they're going to do, those items make their way into the sprint backlog, the team breaks out tasks based on that, and then they start executing that sprint in a fixed time box. The times do not change; it's a fixed time box. In the meantime, you've got the Scrum master managing the process. You're always looking at how you're delivering and your progress against deliverables, so they're preparing things like burn-up charts and burn-down charts. And every 24 hours you have that daily standup or Scrum meeting.
Again, that's also not an original concept; that actually came from plant meetings in manufacturing as well. There are also sprint reviews: at the end of every sprint you review what was done, a lessons learned, or a retrospective as the very common term goes. And you have the finished work, and then you rinse and repeat and go into the next cycle. All good things, but unfortunately, quite often in different organizations, agile methods have been misinterpreted, and they've ended up being misaligned with what you're trying to achieve in the organization. Part of this is that a lot of organizations have had too much of a short-term project perspective, versus longer-term organizational benefits. Now, one of the things you want to think about in the context of this discussion: in those last few slides about those methodologies, how often did you hear me use the word data? The answer is zero, because those methodologies were very focused on software development. Unfortunately, while data is still important, a lot of them, and a lot of the people practicing them, look at data as a byproduct rather than as front and center among the deliverables of what you're doing. And that's where a lot of this misalignment has occurred as well. So we want to bring things back into perspective and really focus on longer-term organizational benefits, to get that alignment with what the business really needs.

In Agile, again, they state it's all about producing usable software in every iteration. That's also misinterpreted quite often: just because you're producing usable software in every iteration doesn't mean you're deploying it to production for your business to use. You may be deploying it to a QA environment for a group of business users to test, but if you have two-week sprints and you keep throwing software changes at your users every two weeks, they're going to have a hard time grappling with that change. From my product management perspective, I see that with our customers as well, which means that as we're rolling out changes in our products, we stage them, package them, and do them in practical releases so that they're consumable in manageable chunks by our customers. That type of practice works very well in internal development as well. Unfortunately, Agile is often used as an excuse, and this isn't the methodologies themselves but the way people misapply them at times, to shortcut or omit other important deliverables that are not the software itself. Quite often the things that fall off the wagon are data architecture, integration architecture, and documentation. And because of the very short-term project focus that may come into play, people often aren't including things like decommissioning of the replaced applications and legacy systems, which contributes and adds to the clutter rather than actually making things simpler and more refined going forward. The other thing that drives me crazy, and I've seen this happen many times, is people interpreting requirements too literally. I've seen many developers say, well, the business user didn't tell us that, and it's often about things like how they should be architecting things, or making sure they're setting status flags and those types of things to track progress in an application. The business user is not going to tell you that.
They're going to tell you their business requirements, but we need to have the architectural discipline to make sure that the underlying design supports and has all of those types of controls that we need to make the applications, and of course the data, function correctly. Again, we want to remove the blind focus on software only. I've actually heard it said, and in fact it's in one of the books, that models are good documentation but they are immediately obsolete. That's a misapplication of the principle of modeling, because people often use modeling after the fact to document what they produced. What I'm going to talk about is that if you use models as the living, breathing solution, particularly in terms of data, and generate your databases from those models, they are not obsolete. They are the working copy of what you're actually going to implement, and your ability to prove out what you're going to implement before you do.

So let's talk about the human factor for just a moment. Again, in the way teams are organized, between Scrum and XP they are self-organized, but there's a bit of a difference in terms of the self-organizing team concept. Quite often this is misinterpreted, and sometimes actually stated, as a roleless team. Particularly with projects implemented with XP, I've seen this where people say everybody's the same, anybody can perform any role, and they might actually switch roles from sprint to sprint. It really de-emphasizes specialization. The reality is that it's a formula for disaster for all but the simplest of projects. What it's really doing is downplaying the specialized skill sets that people bring to bear. Architects, developers, business analysts, data architects and modelers: we all have specialized skill sets. If we organize and bring those specialized skill sets to bear as part of a highly functioning team, we're much better off than trying to do something in a more generic fashion where you're not going to have that body of knowledge really helping you out. To use an example, and most of you who have heard me before know I'm a pilot and very interested in aviation, I'd like to apply this self-organizing, roleless team concept to an airline. I want to know how many of you would actually want to come on a flight with me if we took the pilots, the flight attendants, the baggage handlers, and the ground handlers, put them all together in a little huddle before a flight, and then decided who was going to perform which role just before the flight started. You wouldn't know who was going to fly the plane, or whether they have the credentials to do so, or anything like that. I'm willing to bet that not one of you would get on that plane. So if we're not willing to apply those principles to something like that, why would we apply those principles and risk our businesses? That's the question I'd like to ask. Now I'll get off my soapbox.

What we've often seen when we do experience this type of thing is a spirit of disdain for the data modelers: phrases like developers saying they just slow us down, or we don't need a data model to implement the solution.
And quite often this is perpetuated even further by short-sighted management in the organizations, because they're looking at a specific project and compromising the long term in favor of short-term project goals. But you need to keep looking at the bigger picture as the backdrop to this.

Let's talk about the data architect's and modeler's role in Agile very briefly. What they need to bring is an enterprise data perspective. Again, not all businesses are created the same; there's a lot of overlap, a lot of commonality, but generally speaking the data model is unique to that particular business. There are always aspects of it that are more or less important, but you're probably going to find that there's a very common core across a lot of businesses as well. Because of that enterprise data perspective, you really want to be a facilitator, enabling the teams to function, to utilize the data, and to make sure things are being done properly. You don't want to be perceived as the negative gatekeeper, and unfortunately that is what has led to some data modelers being on the outside looking in: they were very dogmatic in their approach and not willing to adapt to a more patient, facilitative style within these teams. You really want to participate, and you want to be an enabler rather than a negatively perceived gatekeeper. You want to get rid of that us-versus-them mentality, and you want that blend of business background, being able to align yourself with, and communicate equally well with, business stakeholders and the technical teams, really bridging that gap for everybody. So you're very consultative, in fact. What you'll typically find as well is that your data modeler needs full engagement in all sprint planning if you're doing something like Scrum. You need to ensure the completeness of deliverables, and you also need to make sure that the team understands the prioritization of dependencies. Quite often there are dependencies driven by the data, and I'm not just talking about things like referential integrity but general dependencies, where as a data modeler or architect you may have the inside track or visibility into them, so it's your job to communicate them to the team so the work can be organized. You need an iterative work style, and you'll have many simultaneous deliverables. Quite often data architects and modelers are over-allocated, working with multiple teams simultaneously rather than the pure concept of people dedicated to one team. If that's the case, you really need to maintain that cross-project focus. The strength of that is you have visibility into what's going on across those projects, so that again is knowledge that can be brought to bear in all the other projects you're dealing with.

In terms of data modeling, I'm going to talk about this very briefly. The data model separation is very important: conceptual models to drive home initial concepts, logical models where you start to elaborate those concepts more, and physical models, which ultimately end up being the physical deployment of the databases or data stores you're building out of these projects. That includes dimensional models, and it also includes models for NoSQL technologies on the physical side.
Data lineage is important: understanding where that data has come from and where it's going in the organization. And business process models, often done partly by the data modeler or in collaboration with business analysts, are extremely important because they provide the context of how the data is used in the organization. On the data model specifically, like I said, it's a full specification. You have the logical models, and you have the physical models with all the detailed physical specifications. It also includes resolving those gaps between the way businesses, and maybe even developers, think versus the way the data architects think. We're talking about things like persistence boundaries, or what we call business data objects. In other words, as data architects, we think of an order as being comprised of a header, a detail, and other tables, whereas a developer or a business user just knows it as an order. So we shield them from some of that underlying complexity, but we still incorporate both views in the way we build out our models. The descriptive metadata is extremely important. It's not just about delivering software and a database; it's about understanding what it means. That means meaningful names, proper definitions, and supplementary notes on how this data is utilized in the organization. The project is not complete until these types of deliverables are put into play. All the implementation characteristics, data types, keys, indexes, views: all of those are modeled. And the business rules: the relationships in your data models are a statement of business rules that end up being referential constraints in your physical models. Your value restrictions likewise end up being check constraints in those implemented databases. So time spent doing it in the model means you generate them from the model, rather than developers trying to hand-build them as a second step (there's a small sketch of this below). And again, people often forget about the other metadata around it: security classifications of the data, rules for how you access the data, and other kinds of governance metadata in terms of classifying your data for master data management, data quality classifications, and retention policies for archival and purge. All of those types of things need to be worked through. The governance metadata example on the slide is what I'm talking about: just a very simple thing with the bottom two boxes here, things like incorporating master data classes right into your models, even introducing things like data retention policies. A measure of the stability of the data is also very important. And privacy and security are at the forefront, especially with the governance considerations that we have.
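To make that concrete, here is a minimal sketch of the kind of DDL a modeling tool can generate from those model constructs, using the order example above. Everything here is illustrative and assumed for the example, not from any actual model discussed in this session; the point is simply that a relationship in the model becomes a foreign key, a value restriction becomes a check constraint, and descriptive metadata can ride along (shown here with SQL Server's sp_addextendedproperty).

```sql
-- Hypothetical generated DDL for the "order" business data object.
CREATE TABLE OrderHeader (
    OrderID     INT         NOT NULL,
    CustomerID  INT         NOT NULL,
    OrderStatus VARCHAR(10) NOT NULL
        -- a value restriction in the model becomes a check constraint
        CONSTRAINT CK_OrderHeader_Status
        CHECK (OrderStatus IN ('OPEN', 'SHIPPED', 'INVOICED', 'CLOSED')),
    OrderDate   DATE        NOT NULL,
    CONSTRAINT PK_OrderHeader PRIMARY KEY (OrderID)
);

CREATE TABLE OrderDetail (
    OrderID    INT          NOT NULL,
    LineNumber INT          NOT NULL,
    ProductID  INT          NOT NULL,
    Quantity   DECIMAL(9,2) NOT NULL
        CONSTRAINT CK_OrderDetail_Qty CHECK (Quantity > 0),
    CONSTRAINT PK_OrderDetail PRIMARY KEY (OrderID, LineNumber),
    -- a relationship in the model becomes a referential constraint
    CONSTRAINT FK_OrderDetail_OrderHeader
        FOREIGN KEY (OrderID) REFERENCES OrderHeader (OrderID)
);

-- descriptive metadata from the model can be emitted as well
EXEC sp_addextendedproperty
    @name = 'MS_Description',
    @value = 'One line item of a customer order; child of OrderHeader.',
    @level0type = 'SCHEMA', @level0name = 'dbo',
    @level1type = 'TABLE',  @level1name = 'OrderDetail';
```

The design choice being illustrated is exactly the one described above: the business rules live once, in the model, and the constraints are generated rather than hand-built as a second step.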
With that, I'm going to switch gears for a moment and talk about a case study and how it rolled out. I'm going to go through it, then come back and talk about some of the lean principles applied to it and how they actually improved the overall approach. This example was a supply chain application. It was actually for a new business unit, and they were building out a full commercial application suite. You can think of it as the order-to-cash cycle: everything from placing orders right through to receiving the cash at the end. That full cycle was in scope.

The deployment was actually one common database for these applications and the data structures supporting them, but it did integrate with other systems, such as data acquisition systems, to import data and update statuses and those types of things. There were four parallel development streams organized by functional area. The planned duration of the project was one year at a cost of $6 million. It was an agile project, a combination of XP and Scrum, but very heavily weighted toward the XP side, with some Scrum discipline in there as well. The way this project was stood up, it used those self-organizing concepts where the developers were responsible for all the design and development, and it was broken up into two-week sprints or iterations as they worked through the project. The weekly budgeted cost was just under $93,000, and that was the direct cost of the actual project team itself. The project also included business subject matter experts, but their costs were not incorporated here because they were considered overhead in the business; they were part of the corporate budget anyway. So it's direct resources that we're talking about here.

So this is what happened in the initial project. Obviously a new initiative; people don't get to work on greenfield very often, so everybody involved was quite excited. The business was excited with this new line of business spinning up, and they were anticipating all kinds of great results. They tried to do a lot of the right things by co-locating business users with the team and those types of things. Even though it was a distributed team in a couple of different places, they tried to co-locate as much as possible, and they had some people traveling back and forth as well. And they were expecting great things out of this. After a while, some reality started setting in. They started seeing a fairly high defect rate making its way back into the backlog. The backlog was growing rapidly, but the more they looked at it, the more they saw that it was not features or things that were actually supposed to be built; it was defects accumulating more and more in that backlog. By the 16th week, four months in, 50% of their effort was being spent addressing nothing but defects. Against that direct cost, that means about $46,000 a week. And then, using the burn-down charts, they projected this forward: the way it was going, it was going to add another 40 weeks to the schedule, which was something that could not happen because it would have delayed the launch of the business unit as well. At roughly $93,000 a week, those 40 extra weeks would have run an additional cost of $3.7 million on the project alone, never mind the lost business opportunity from delaying the business unit. So, a very serious problem.

So with that, we needed to come in and assess what the problem was, and this is where I was actually brought in to assess what was going on in this particular initiative. I'll take a step back and explain my background a little. You've heard that I've had a background in multiple industries. I did start my career in manufacturing, and my background is in management as well as technology.
Because of my manufacturing background, I also had a very keen interest in quality, in things like the Six Sigma movement and everything that grew out of it, and I am a Six Sigma Black Belt as well. I'm not going to try to turn you into Six Sigma Black Belts here, but the DMAIC cycle, in other words the Define, Measure, Analyze, Improve and Control cycle that is part of Six Sigma, was very useful because it allowed us to assess and measure the problem, define a solution for it, and then make sure that we actually turned it around and corrected it properly. So we're going to walk through that a little bit.

In terms of Define, we really looked at what the categories of incoming defects were, so we could arrive at some conclusions. How do we measure them? A discrete measurement in terms of number of errors, number of defects; but not all defects are the same, as you know, so we also looked at the weighted impact of those errors. And as you'll see in some of the graphs I'm going to show you, you sometimes get a lot of noise if you look at it on a period-by-period basis, so using things like cumulative measurements to introduce some smoothing, so you can really understand the trends of what's going on, is important here as well. In terms of the Analyze stage, we looked at things like the time series distribution of defects, and what that translated to in terms of defects per object produced. And a concept that I'm not going to go into really deeply, but you're going to see some numbers for it, is a very important statistic in Six Sigma: not just defects per object, but defects versus the opportunity to create a defect. In other words, if you're delivering things, whether it's items off a production line or software features or data constructs, every time you produce something there's a potential to introduce error. So you're measuring the number of defects you actually originate versus the potential to create those defects; that's all it's really about. The Improve stage is the remediation strategy to address all of this. And then, once you've gone through that, there's a Control period. That's when you want to keep maintaining those comparative metrics, to ask: are the changes that we've implemented truly making a difference, and do we have tangible results to back that up?

So we'll go into the Define stage very quickly here. This is known as a fishbone diagram, and when we looked at it, and I'm not going to read through all of the items here, there were four primary areas where we were seeing the defects come in. Some were in requirements; some were in the way things like the user interfaces were being built and the feedback based on that; quite a few were in the database and persistence layer of the application; and others were in the business services. All of those were feeding into system defects and requiring rework to address them. When we looked at them, though, the highest potential impact was of course coming out of the database and persistence errors, because as your database and persistence change, everything on top of them changes as well. So that was really an area of concern. When we looked at the cumulative defects, just on face value of defect counts, 35% were in database and persistence, 30% in business services, 28% in user interface, and a low 6% in requirements.
And again, quite often a bad requirement can have a bad ripple effect, but because the error rate was so low on requirements, that ruled it out as the thing we'd concentrate on first. There were things taking a much bigger bite out of what was occurring in this project. So we took those categories and looked at the actual defect count as well as the cumulative count to produce some graphs of the cumulative defect rate. We also wanted to look at the different severities in the areas involved, so we came up with a weighted score: the count of defects multiplied by the average severity points of the types of defects, to give a cumulative score. You don't have to worry about all the detailed math here; that's really the principle I want you to get out of this particular slide (there's a small sketch of the calculation below). A graph is always easier to understand: it again shows the categories down below, and the line graph above shows the cumulative defects. When we looked at the weighted score, that brought database and persistence up to 63% of the problem, followed by business services and then UI and requirements. Again, a partial glimpse of the scores for some of the things happening in the data: duplicate tables scored very high, along with non-normalized tables and incorrect primary keys. Those of you who work with data can see, yeah, those are some pretty dangerous areas, but even things like missing constraints, incorrect naming, or incorrect data types all contribute to the problem. Again, we weighted those based on the actual count of defects in each of those categories to come up with the total score.

Then we looked at how the defects were distributed over that 20-week period. We looked at 20 weeks because, of course, when we started the analysis the project was still going on; another four weeks elapsed by the time we had analyzed it. It looks like a scatter chart, almost like a stock market diagram, but what you're really looking at is objects created in red and the number of defects introduced in blue, and you see that even though there's a lot of variability, they almost seem to track relatively close to one another. The way to clean up that noise and really understand what's going on is to look at the cumulative curves, accumulating over all those time periods. All of a sudden we have a much smoother curve to look at, and here's the telltale sign: our cumulative objects, in blue, are actually less than the cumulative defects on top. That means that in aggregate they were producing more defects than they were producing objects, which is a very serious problem.
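To illustrate the measurement mechanics, here is a sketch of how those two calculations could be expressed in SQL over hypothetical tracking tables. Both tables, their columns, and the data are invented for this example; the project's actual defect tracking isn't shown in this session. The first query produces the weighted score per category (count times average severity points); the second produces the cumulative smoothing curves.

```sql
-- Hypothetical tables, for illustration only:
--   DefectLog(DefectID, Category, SeverityPoints, WeekNumber)
--   WeeklyTotals(WeekNumber, ObjectsCreated, Defects)

-- Weighted score per category: defect count multiplied by average severity.
SELECT
    Category,
    COUNT(*)                                              AS DefectCount,
    AVG(CAST(SeverityPoints AS DECIMAL(9,2)))             AS AvgSeverityPoints,
    COUNT(*) * AVG(CAST(SeverityPoints AS DECIMAL(9,2)))  AS WeightedScore
FROM DefectLog
GROUP BY Category
ORDER BY WeightedScore DESC;

-- Cumulative smoothing: running totals of objects and defects by week.
-- Whenever CumulativeDefects exceeds CumulativeObjects, you have the
-- telltale sign described above.
SELECT
    WeekNumber,
    SUM(ObjectsCreated) OVER (ORDER BY WeekNumber) AS CumulativeObjects,
    SUM(Defects)        OVER (ORDER BY WeekNumber) AS CumulativeDefects
FROM WeeklyTotals
ORDER BY WeekNumber;
```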
So this obviously required some very drastic measures for remediation. The first thing was to apply some of the lean principles: number one, increase efficiency and eliminate waste, with an obvious focus on defect reduction in particular; build the quality into what was actually being delivered; create knowledge among the team about how to remediate this and what the plan was; and continually optimize all aspects going forward. It also meant bringing in a senior data architect with a cross-team focus, introduced in week 21 of the project. Developers were no longer designing and developing the database themselves. In terms of process changes, all data changes were modeled, and the database was generated from the modeling tool.

Developers could still work in their sandboxes to propose ideas and refine things, but the sanctioned data structures came out of that data model. There was also one developer dedicated to persistence mapping, because they were using one of the object persistence frameworks. Rather than having multiple teams that didn't necessarily know how to utilize it well, there was one dedicated persistence mapper who worked with the data architect to make sure that the entire data layer was laid out cleanly each and every time. Also, rather than just continuing on, there was an actual halt in functional design and development so we could reset and recover from the problems that had accumulated. It was basically redesigning the database based on what was known to date and then continuing to refine it from there. That meant a few sprints were also dedicated to cleaning up the problem, just to get things back on track. We also set a target, because you always want to measure your success against a target, and the target was to reduce the data defects by at least 75% going forward. Some may think that's a very aggressive objective, but it's something that needed to be done to get things back on track.

So let's look at how things played out. What you're seeing now is a continuation of those charts we looked at earlier, and the scale has changed a little because of the way things have gone. There's a hard line at week 20, and of course week 21 is where we started introducing these changes. Look at the number of objects beyond that line and you actually see a couple of major spikes in the number of objects created; but what you also see is a very flat curve in the number of defects created at the same time. That means goodness is happening here: we were actually having success in addressing the defects. What you're also seeing in those major spikes in the number of objects is the database redesign. That includes a lot of things that were missed in the original database: making sure those referential constraints, check constraints, and all those types of things that we would normally expect were there, the majority of which were actually missing. There were basic constructs in the original database, but it was not a clean design by any means. Now take the other curve, the defects per object. Again, we see a lot of noise in that first period, but we see the same basic trend: the defects per object has gone down significantly. And look at the magnitude of the numbers: you're seeing values like 1.5 and 2.3 in week one, for example, whereas they're near zero in all the weeks following week 21, when the changes were implemented. And this is the telltale chart right here, because this is the one that took those cumulative objects and cumulative defects we saw earlier and projected them beyond week 21, when we introduced the changes. Very quickly, within those first two weeks, all of a sudden we had more objects than we had defects. The defects flatlined and the number of objects continued to increase. That meant we were delivering a lot, and we were delivering it with a very high level of quality, and that's what you really want to see out of this.
Let's look at some numbers here, and I'm not going to go through and justify all of them, because we'd be here all day if I went through the math behind it. But basically it means we had more objects created in that 11-week control period than we had in the original 20 weeks. We had far fewer defects, and when we look at the defect opportunities, we actually had more opportunity to create defects in that shorter period of time than we had before. The things we want to look at here are defect points per week and, in the bottom yellow row, the defect points per opportunity, which improved by 1,972%. Our target was to improve defects by 75%, so a huge win on that front. But what does this really mean? The bottom line was that the project got back on track and completed on time, and the $3.7 million overrun was avoided. The cost of introducing the data architect and the modeling tools for that remaining period of time was about $200K, so the return on investment is the $3.7 million saved minus that $200K, divided by the $200K spent: (3,700,000 - 200,000) / 200,000 = 17.5, or 1,750%. If it had been done at the beginning of the project, the returns would have been even greater. I don't think anybody, having seen numbers like these, would not want to follow this type of approach.

So now, very quickly: what's lean, and how did we apply it here? We talked earlier about how it had that basis in manufacturing, and it has been adapted to knowledge work, software, and systems methodologies. It grew out of things like the Toyota Production System that we talked about right at the beginning. The really important thing is that it's an organizational focus versus a pure software focus, and that's extremely important. What you're really looking at is a repeatable process to minimize waste and maximize value in the organization, but it requires quality standards and the collaboration of specialized workers, recognizing their roles and the contributions they have to make. And of course there's also the fundamental philosophy of kaizen, which is a Japanese word: kai means change and zen means good. In other words, change for good. It's rooted in continuous improvement: always making small incremental improvements in all areas of the company, or all areas of the initiative you're looking at.

Some of the basic principles that are part of lean: Eliminate waste; we've talked about that. Eliminate anything that doesn't add value. Build quality in: quality isn't QA's job, quality is everybody's job. It's test-driven incremental development with constant feedback when we're talking about software and systems. In manufacturing, it's also empowering people to stop the assembly line if they see something wrong in a product that's being manufactured: empowering the people to stop that line, rectify what's going on, and then resume production, which is very similar to what we did here when we stopped the entire process, rectified it, and then moved on. You need to create knowledge, and that means properly documenting and retaining valuable learning: documenting things like your data models, creating wikis, and those types of things for how to use the applications you're building. It isn't just about the software; the entire deliverable, the data, the software, the documentation, all of it is part and parcel of what should be delivered. And of course, you want to deliver fast.
You want to remove blockers that are getting in people's way, and you also want to make sure that you're concentrating on the right things and not over-engineering what you're doing. In all aspects, you need to be respectful: communication, handling conflict, onboarding resources to teach them how you go about it, how you're improving the processes, and again, making sure that you're empowering people to do the right things. And optimize: don't sacrifice quality for speed. Obviously you want speed, but quality is the paramount concern. You also want to make sure that you understand capacity and the downstream impact of all your work. You're not working in a vacuum, and you shouldn't have two-week blinders on just for the sprint; you want to know what context you're working in and what the end deliverables are, as your frame of reference. And you keep looking at how you can identify and optimize the value streams out of the work being done.

Some people think Agile and Lean are the same, but they're not, and here are a few things that set them apart. Agile was proposed as a better way of developing software. Lean is strategic as well as operational, and when applied to these types of things it's really meant to improve IT's value to the organization overall, not just the software. Agile is very much a bottom-up focus, with short-cycle frequent delivery characterizing part of it, but Lean still maintains that top-down, end-to-end focus as well as the day-to-day operational view, because you want to see the whole of what's going on and track how you're delivering against that whole. Kanban is something that's used commonly, and if you're not sure what that is: quite often people use it for things like the Agile board. When you see the sticky notes and that type of thing being used to track tasks, that originated from kanban in manufacturing. What that was is basically a pull philosophy: when you were feeding components to the next downstream work centers, as part of the overall initiatives for just-in-time manufacturing, you just didn't keep pushing product onto the following workstation. When that workstation required more, they would literally send a card back to the previous workstation to say, I'm going to need these, and then those would be produced and pulled up to that workstation. So it's really about maintaining a manageable continuous flow, and the real focus there is limiting work in process. Whereas in Agile it's really limiting the time of development within those fixed time-box sprints. So there is a fundamental difference there. Also, in Agile each iteration begins with a fresh board. The overall backlog persists, but each sprint has a new sprint board, a new take on the board. In Kanban it's not starting new: when a task completes, you're pulling the next one in sequence. So rather than having fixed time-box sprints and changing the deliverables, you have fixed milestones where you're evaluating your progress, but you're continually pulling the next task or deliverable in sequence, which actually leads to a much smoother type of delivery. Again, Agile focuses on delivering software; Lean focuses on delivering real value, not just the software. And in this context, people do like to bash waterfall when they're talking about Agile.
But in this context, Agile has just become the new waterfall, because there's so much more that Lean brings to the equation. Now, very quickly, because I've only got a few slides left, let me talk about how we applied this in that project and other projects I've worked on. This is basically what the data modeler or data architect is doing within the sprints and how you manage it. Participate fully in the iteration planning; I'm going to use generic terms in case you're not using Scrum or XP. Always make sure there's a named release or a version of your model as of the completion of the previous iteration, because that way you always have a baseline for compare and merge. That allows you to compare what your state was at the beginning versus your state at the end, whether it's a sprint or a milestone in a lean type of development. Structure your model and utilize what we call submodels, or subject areas, so that you can communicate the relevant data to the appropriate audience for that topic. That helps not only the technical staff understand the data that comes into play; it's actually understandable by business users if you do it correctly as well. Quite often you may tie it right back to the user story level if that's really going to help facilitate the communication. Any changes you make there automatically roll up to the parent-level models, and of course in a tool like our ER/Studio, anything you do at a lower level you can roll up; you can have multiple levels and really organize it along a business decomposition if you like.

In terms of managing the iterations themselves: again, always have a baseline for compare and merge; I can't overemphasize that. Within the iteration, the workflow is to model each change, associating each change with the appropriate task or user story; having that audit trail is extremely important. Generate incremental DDL scripts and stage them to the build server, because I'm assuming you're doing something like continuous integration here. For the builds to work, the data changes and the software changes need to roll out into the build simultaneously, or the build is going to break. In some systems that also means you need a robust script naming convention, because some build systems apply scripts alphabetically, so you need to make sure you have ordered scripts and those types of things as well (there's a small sketch of this after this section). Again, one data modeler might be working with multiple dev streams simultaneously, so you are going to be a jack of all trades, juggling. Some designs are going to be originated by the data modeler; others are going to come from a developer playing in their sandbox, talking to the data modeler and saying, here's what I think we need. The data modeler will evaluate it against what's already in the model, refine it, and then push back the corrected and sanctioned DDL to the developer, who will then refactor their code as appropriate for any changes that may have been made, and everybody uses the officially sanctioned script from that point on. At the end of the iteration, create your named release, and create a delta script using compare and merge, because that gives you a composite script covering everything from the beginning through all the changes to the end. You could run all the incremental scripts in sequence, but it's a very good time saver to create the one delta script that does it all, especially as you move on and do more and more.
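As a sketch of what one of those staged incremental scripts might look like: the file name, the task reference, and the table here are all invented for illustration. The key idea is a sortable numeric prefix, so that build systems that apply scripts alphabetically run them in the intended order.

```sql
-- Illustrative file name: 0021_003_add_shipment_status.sql
--   0021 = iteration number, 003 = sequence within the iteration,
--   so an alphabetical build order matches the intended apply order.
-- Task reference (hypothetical): user story US-412.

ALTER TABLE Shipment
    ADD ShipmentStatus VARCHAR(10) NOT NULL
        CONSTRAINT DF_Shipment_Status DEFAULT 'PENDING';

ALTER TABLE Shipment
    ADD CONSTRAINT CK_Shipment_Status
        CHECK (ShipmentStatus IN ('PENDING', 'PICKED', 'SHIPPED', 'DELIVERED'));
```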
Again, submodels for the audience-specific perspective; I keep emphasizing that. And keep maintaining the discipline of model first, then generate, and that way you'll stay on track. Full participation in the planning and retrospectives, every single time. In terms of change management, what I'm talking about here is, again, in something like ER/Studio, when you're making a data model change you can actually reference tasks. If you're using something like Jira, you can reference them when you're checking out or checking in your model changes: you pick the task or user story it applies to, and you get a change record. The nice thing about that change record is it shows you what's changed (that's what those yellow delta symbols are) and it ties it right back to the user story it came from. So it's also very valuable for governance and passing audits.

Managing the complexity means having an overall plan guiding the initiative, and that means requirements analysis and some modeling before development starts. It's not everybody at the start line at once; it's being smart about how you organize the project and having that critical mass there before you launch headlong into development. Some areas are going to be very complex, and they may require multiple iterations to design and develop. You cannot do everything in a two-week or one-month sprint; you've got to break it down. Use data model design patterns as a starting point for things that are common; there are data model design patterns just like there are software design patterns, so take advantage of them. And use what I call the wave approach, and this can be controversial for some people, but it means not everybody is working on the same thing at the same time. Sometimes your data modelers, or maybe even your business analysts, may be working on some items one or two iterations ahead of the development team. It's to pave the way and build that critical mass so the developers can roll into developing the solution. Think of it like waves in an ocean, a constant progression, and that ties in very well with the continuous-flow manufacturing analogy I talked about as well. Logical and physical modeling separation facilitates this: I can be working on what the design is going to be in a logical model, even if I'm not changing the version at a particular point in time, while my developers are utilizing the generated database from the physical model. Once we're happy, we do a compare and merge from the logical model to the physical model to push the change into the physical model, which generates the DDL and the change scripts, and we go on from there. It's always about doing the right things, and the appropriate changes, at the right time, and again keeping that enterprise perspective of the data. In this type of work, the data models need to be fully documented. You're not going to get it all in one sprint or one iteration, but by the end of the project everything needs to be documented: that includes data dictionary definitions, and all relationships and role names should be documented. The physical model is the implementation, and that means all the physical constructs should be represented. Compare and merge: I'm just going to go through this very quickly.
Compare and merge, I'm just going to go through this very quickly. What compare and merge is about is that you take the left side, whether it's a logical model or the existing state of the database, versus the model changes you may have made; you can compare all the differences, and you can push the changes into the target. You can actually do it bi-directionally if you need to, so if you find some defects on one side versus the other, you can pull things back into the model as well as push them out into the database. So there's a lot of flexibility here, and by doing this with platforms like SQL Server and those types of things, you can generate your net change scripts as well.

This is an example of an alter script, or change script, and a very simplistic one, showing the different things on SQL Server. You're probably going to augment this, because quite often you'll also have data that's part of the change, so you might do things like preloading reference values to add to these particular types of scripts. And what I'm depicting at the bottom right here is that if you're doing something like automated builds, you're probably going to check these generated scripts into a build system in conjunction with the other development components, and your build system will pick them up and run them.
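The slide itself isn't reproduced here, but a sketch of that kind of net change script, combining a schema change with a reference-value preload, might look like the following. Every table, constraint, and value is hypothetical.

-- 0043_add_order_status.sql
-- Net change script: a new lookup table, a foreign key on an existing
-- table, and an idempotent preload of reference values so the script
-- can be re-run safely in sandboxes as well as dev, QA, and production.

CREATE TABLE dbo.OrderStatus (
    OrderStatusId INT          NOT NULL PRIMARY KEY,
    StatusName    NVARCHAR(50) NOT NULL
);
GO

ALTER TABLE dbo.[Order]
    ADD OrderStatusId INT NULL
        CONSTRAINT FK_Order_OrderStatus
        REFERENCES dbo.OrderStatus (OrderStatusId);
GO

-- Preload reference values only if they are not already present.
INSERT INTO dbo.OrderStatus (OrderStatusId, StatusName)
SELECT v.OrderStatusId, v.StatusName
FROM (VALUES (1, N'Open'),
             (2, N'Shipped'),
             (3, N'Closed')) AS v (OrderStatusId, StatusName)
WHERE NOT EXISTS (SELECT 1 FROM dbo.OrderStatus s
                  WHERE s.OrderStatusId = v.OrderStatusId);
GO

A script like this would be checked into the build system alongside the matching application code and persistence-layer changes, so the database and software roll out into the build together.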
At the end of the iteration, again, create that named release at completion. It serves as a baseline not only for the start of the next iteration but for anything after that: you can always do a comparison between named releases at any later point, or actually any earlier point as well, so you can always see what the journey was along the way. And I'm just going to go through these quickly. The DDL script is what you use to create sandbox environments: say you have another developer, or you're starting a new sprint; they push that out, they refresh their sandboxes, and away they go. Obviously you're going to promote these through your dev, QA, and production environments at the necessary times as well. And you also want to make sure that your models have been fully published so that people can reference them.

If you have an automated build system, and again I've already emphasized this, you need synchronized deliverables. That means your database changes (in other words, the DDL being checked in), your application code, your changes to your persistence framework or your data services, and the framework updates all go into the build system together. Then the automated build runs. On the more successful projects I've been on, and in fact what we ended up implementing in that other project as well, is that if the build broke, we actually had flashing lights tied into some of the workstations. So if the build broke, we had red flashing lights in the room, and everybody, including the business users, knew that the build broke for some reason, and it was all hands on deck to resolve the problem before we moved on with anything else. It was very effective: we resolved the errors right away, shutting down the assembly line, and only went back to work once we had the problem resolved.

That was a lot, so just a very quick postflight debrief here. Systems development is continually evolving and improving. As I've pointed out, despite claims to the contrary, there really have been no brand-new, groundbreaking ideas that came out of software development itself. Most of what we're doing is derived from manufacturing and industrial engineering principles and practices that have been proven to deliver business value in other venues. We need to learn and adapt based on that cumulative body of knowledge, whether it's manufacturing, software, data, those types of things. And very importantly, all organizations are different, so you need to adapt and fit this to suit your organizational culture as well.

Data has always been important, and more companies are recognizing that. Just remember: applications come and go, but companies always want to retain the data that they've captured. You're doing data conversions and everything else when you're replacing applications; even though the applications are being tossed, that data remains, and it's the important focus. To handle all of this, data models are more important than ever. We have increased complexity that we need to manage, with multiple data stores in different areas of the business. Modeling increases the quality of the deliverables, it delivers a higher level of value, and, more importantly, taking this disciplined approach really is an insurance policy against failure.

The lean principles improve systems development. It's a value focus, really concentrating on efficiency, reduction of waste, and of course customer satisfaction, whether it's internal customers or external customers that you're dealing with. And generally speaking, approaches utilizing lean are the most successful: predominantly adaptive in nature, but you also want to maintain some of those predictive capabilities, because that's your basis to measure and evaluate how you're tracking as you move forward. So you're really utilizing the best of both worlds. Beware of anybody who says the old way's no good, this is the way we do it from now on. It's always a building body of knowledge, and we're moving forward that way.

Again, that was a lot, so thank you for that, and I didn't leave a lot of time for questions, unfortunately, but Shannon, I'll take whatever we can get through now. Thank you so much for another great presentation, and if you do have questions, feel free to submit them in the bottom right-hand corner of your screen. And just to answer the most commonly asked question: I will be sending a follow-up email by end of day Thursday for this webinar, with links to the slides and links to the recording of this session. So Ron, can you speak to inter-team communication among concurrent projects, and, relatedly, reconciling project-specific models with enterprise models? I think I've touched on that topic previously, but it really ties into that short-term, project-based focus versus the longer-term, enterprise-based focus. From a data perspective, I guide everything with the idea of an enterprise data model. When you're working on a particular project, you're working on a particular area, and you want to make sure that what you're doing is aligned with the principles of your enterprise data model; while you're doing this, you're also updating and reflecting the necessary types of things in that enterprise model as well. So it's a constant reconciliation process that goes on. As soon as you do one in a vacuum, without consideration of the other, is when you actually start to run into problems. Alrighty, well, that does bring us to the top of the hour, but thank you so much, Ron, for this great presentation again, and thanks to our attendees for being so engaged in everything we do.
If you have additional questions, I will get them over to Ron, and I will include his contact information in the follow-up email, which again will go out by end of day Thursday. So Ron, thank you again so much. Thanks to everybody; I hope you all have a great day. Thanks all. Thank you everyone.