Hello and welcome. My name is Shannon Kemp and I'm the executive editor of DATAVERSITY. We'd like to thank you for joining today's DATAVERSITY webinar, "How Can You Calculate the Cost of Your Data?", sponsored by Paxata. Just a couple of points to get us started. Due to the large number of people attending these sessions, you will be muted during the webinar. For questions, we'll be collecting them via the Q&A in the bottom right-hand corner of your screen. Or, if you'd like to tweet, we encourage you to share your questions via Twitter using the hashtag #DATAVERSITY. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and additional information requested throughout the webinar.

Now let me introduce our speaker for today, Michelle Goetz. Michelle comes to Paxata with over 20 years of experience in marketing, data management, and analytics. Previously, as a Principal Analyst at Forrester Research, she was one of the most sought-after analysts leading research on data management and artificial intelligence, helping leading enterprise organizations and government agencies define their strategies, best practices, data governance programs, and technology decisions. Most notably, Michelle championed research on data preparation solutions and practices. And with that, I will give the floor to Michelle to get today's webinar started.

Hello and welcome. Hello, Shannon, thanks for having me today. Hi, everyone, thanks for joining us. We're going to talk about how you can calculate the cost of your data. We're always looking to maximize the value that we get from our data: to get better insights, to improve our operational efficiencies, and to grow our businesses. So as we look at our data, we're really thinking about what it looks like to manage it appropriately. How do you handle the quality and trust of that information? And how can you be efficient and effective in doing so?

Let's start off with what great data looks like and what it helps you do. We're all very familiar with the airline industry, and we all know that we're continuously shopping around for the best price on our seats. But the price of a seat is always this ambiguous black box. You don't quite know what the price is going to be, you don't understand how it's going to fluctuate from day to day, and you can certainly be sitting next to somebody who has purchased that seat for more or less than you, or who hasn't paid anything at all for it. Well, it's not by accident. It's because the airline industry, out of years of pressure, first from deregulation and then from pricing wars, had to figure out a better way to be competitive in the market or go under. During the '90s and early 2000s we saw a number of airline companies simply not able to survive, particularly as price wars came up from a number of upstarts in the industry. So what did they do? They turned to their data and asked: What do we know about our travelers? What do we know about the prices they'll pay for our seats? What do we know about their booking habits? And what do we know about the places we fly to and which ones people are interested in?
And there are a number of other pieces of information that can be organized within these analytics, but at the end of the day, what it helped them do is introduce dynamic pricing. That is where they could be competitive, that is where they could survive, but also where they could grow and put forward other types of offers and packages, and even other travel experiences that you might not have expected before. So certainly analytics was the powerhouse behind this. But at the same time, the data had to be trustworthy so that they could rely on the analytics.

So let's just take a look at this. If you think about what they've been able to achieve: are you able to demonstrate business ROI from the investments in your data, or is your data inhibiting it? I thought we would turn this over to a poll. Shannon, if you could help me out here so that people can participate. I'll give it just a minute.

Oh, give me just a minute here, Michelle. We're going to get that up for you.

No problem. So while she's getting that together: one of the things that has always been fascinating to me is to see how organizations actually go about determining the ROI. How do they make decisions about their data investments, and what are the business cases that are being put together? And we find a range of how organizations go about it. It could be that it's loosely tied to an initiative. There's a reason to recognize that by reinvesting in new and modern capabilities, you can reduce your costs. But some of the more tangible ways of identifying ROI are often hidden, and it can certainly be a challenge. Let's see how we're doing here. So I think we've had enough time, Shannon. Maybe we can take a look at the results. It seems like we're having a little bit of a challenge with the polling right now, so I'm just going to move along. Oh, actually, it's going... Oh, you got it?

It's closing right now. Sorry, Michelle, it's closing right now. We've had some last-second answers coming in. It is closing and submitting, and I will push the results here as soon as that is done. We have some Jeopardy music. There we go.

Oh, I wanted to see that vote. Yeah, so it looks like a number of you are definitely saying, no, it's really hard to demonstrate ROI. And that's pretty typical of what I've seen. There's an understanding about where you want to invest in data and what area of the business you might want to be supporting, but nobody's really put hard metrics to that. And even after the fact, trying to measure it is often very difficult. Hopefully as we move forward, I can give you some things to be thinking about. Even if you can't get to the point where you have the hard numbers or all of the facets that go into this, hopefully I'm at least giving you a model and something to think about as you're building your cases for more investment in your data.

Certainly, having had conversations with Gartner and looked at some of their research, what we're finding is that by 2017, 33% of the largest global companies will experience an information crisis due to their inability to adequately value, govern, and trust their enterprise information. That's actually a really large number: a third of companies are going to have challenges around that.
And it really points to the fact that we don't fully have our arms around the way that we manage data today. Certainly within pockets of the organization — if you're in financial services or another highly regulated environment — there are efforts underway to ensure that you have strong management, governance, and quality practices in place. But other areas, maybe around customer experience or logistics or inventory management — some of the other operational or transactional areas of your business — aren't necessarily as well governed. And certainly within the areas of your organization where you're doing much more exploration and discovery for your analytics, not everything has been put under a governance policy, and you're having to really play in the data to understand its meaning and to understand the value or even the risk around it. There is so much data coming into our organizations today that it truly is difficult to have a handle on all of it. We're working in a state that's probably a little more chaotic than what we've been used to in the past, and we have to come up with new practices to accommodate that.

So where are some of the areas where organizations are feeling the impact, either through lack of governance or transparency, or just challenges in generally managing information? Certainly there are missed revenue opportunities, because maybe you're not using all of the information available to you to have the most complete insights for making decisions. There's customer churn for very similar reasons, where you might not be able to see the different interaction points you have with customers and engage with those experiences. Inaccurate forecasts certainly occur through missing pieces of information, or an inability to gather information and account for the dynamic changes happening in the marketplace, whether in a general macroeconomic sense or in micro and competitive scenarios. Certainly we see aspects of fraud, which has been top of mind in the news of late, and compliance has always been a concern. Within the retail space, or any other manufacturing or industrial type of organization, managing inventory has a strong emphasis on it right now, with just-in-time inventories and just-in-time fulfillment going along with that. So tying your inventory and your logistics and fulfillment together is really top of mind, and you see companies like Kohl's and Walmart and Lowe's shifting to ship-from-store as orders come in, and so forth. And lastly — and oftentimes what we're not really thinking about and quantifying — is the lack of worker productivity. What happens when workers don't have the right information in front of them? What happens when that information might be inaccurate? Or what happens if it just takes a long time to bring that information together? So there's a wide variety of areas where data can slip through the gaps and cause some of the challenges that Gartner talked about earlier.
And on the flip side, when we turn to Forrester and take a look at what they've been talking about, one of their analysts brought up a very interesting point: too often you're looking at your investments in data — and here he's talking about big data projects — with more of a bottom-up type of approach, where you're looking at a variety of data sources and bringing them together. You don't always have an understanding of what you're trying to do or achieve with that data. The goal is simply that this is information people feel is important, so at least we should bring it all to the same place. But with a goal in mind — with an understanding of how that data is going to be used — you can actually change the shape of the information so that it becomes meaningful in the context of the types of insights you're going to derive, the types of decisions you're going to make, and the types of outcomes you're anticipating.

And so — oops, sorry, hit the wrong button — what does this really look like? One of the areas is a recognition that when you start with a bottom-up approach that is only loosely linked, or at least aspires to be linked, to improving the business, in the end that doesn't actually pan out, and there's a real loss of time to business ROI in that type of scenario. Go back a few years and think about how we made investments in our big data environments. One example is a big box retailer that was standing up a Hadoop distro. Really, it was about modernizing the data center and accelerating integration: how do you handle the volumes of information coming in, reduce your costs, and avoid the costs that come from all of the integration capabilities that are required, or the resources you have to bring in to go through that type of migration? In the end, they might have saved on their warehousing costs by moving over to a Hadoop distribution. But in reality, the business ROI that should have been achieved from it took a long time. It took 18 months before they were able to identify a use case within a specific business unit to even run a pilot. They eventually found a scenario within loyalty analytics, being able to customize offers, and certainly once those were put into place, within a matter of weeks those pilots were proven out and they were moving to scale. But in the grand scheme of things, it took two years just to take that one pilot and put it into mass production. So if you think about migrating from a higher-cost, heavyweight system to a lower-cost, more agile system like a distro, and then having to go find your business cases: have you really proven out the ROI there? Or are you still creating pent-up business debt because you haven't been able to really roll out the capabilities that are available? And then when somebody comes back and says, well, where have you been spending your money and why are you asking for more, it's a really hard conversation when you answer, well, I did it in the interest of improving the total cost of ownership of my data center — because they're having a hard time seeing how that relates to their business objectives.
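As a back-of-the-envelope sketch of that lost time to business ROI, the arithmetic might look like the following. The timeline comes from the retailer example above, but the dollar figures are purely illustrative assumptions, since the retailer's actual numbers weren't shared:

```python
# Back-of-the-envelope cost of delayed business ROI after a platform
# migration. The timeline comes from the retailer example above; the
# dollar figures are purely illustrative assumptions.
monthly_use_case_value   = 250_000     # assumed value of the loyalty-analytics use case, $/month
months_to_first_pilot    = 18          # time spent finding a use case (from the example)
months_pilot_to_scale    = 24          # pilot to mass production (from the example)
annual_warehouse_savings = 1_000_000   # assumed TCO savings from the Hadoop migration, $/year

delay_months   = months_to_first_pilot + months_pilot_to_scale
deferred_value = monthly_use_case_value * delay_months
tco_savings    = annual_warehouse_savings * delay_months / 12

print(f"Business value deferred over {delay_months} months: ${deferred_value:,.0f}")
print(f"Infrastructure savings over the same period:        ${tco_savings:,.0f}")
# With these assumptions, the deferred business value ($10.5M) dwarfs the
# infrastructure savings ($3.5M) that justified the project in the first place.
```

However you set the assumptions, the point is the same: TCO savings alone rarely cover the value left on the table while use cases sit waiting.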
And so really what we've been doing is setting up pipelines that are designed for our systems, and not thinking about the people. We talk about this idea of going from raw data into information all the time. If you really take a hard look at it, though, we are surrounded by information today. That information comes from our conversations, our relationships, our experiences. There's already a rich set of context around that information sharing and creation. But how do you harness that at scale? That's when it has to be brought into our data centers and our systems, and we feed it in. In that process, we have to deconstruct that information, creating a lot of metadata around it. We are disassociating information — particularly if you think about relational databases, where you're carving up blocks of information and putting them into a structure that, while it has a certain semantic element to it, is still very siloed. So we're breaking that down, and we might even become highly atomic with that information as we further decompose it and optimize the data into a physical state that our systems can process, be highly responsive to, and deliver high performance on when requests for that information come in. As we process this data, that deconstruction, disassociation, and atomizing of our data strips away a lot of the meaning that was originally in place when we started with that rich information. And this has really been the crux of our challenges in trying to handle data and get the most value out of it.

We had a conversation recently with one of our financial services customers, and they really put it into great words, so I thought I would share this with you. They talk about traditional tools requiring a lot of installation and setup time — it just takes so long. And how do you work on full, large volumes of data? Sometimes when you go through your analytics, you don't want to be sampling. You need to look at everything, and certainly if you're in a regulated environment, looking at everything is required to accommodate the risks and the data that you have to take care of. Going back to what we said about going from rich information through processing: that software development life cycle is really detaching us from our information and making it very raw. You're writing a lot of code, you're doing a lot of packaging, and then you're running it. It's creating an environment where you are disassociating even your business users from their data, to the point where they don't always feel that they own their own information and should be accountable for it — they're abdicating that responsibility to you in the IT department. One of the things that's really important to recognize here is that the business has to work with the data. They have to see it. They have to play with it. They have to understand it themselves and put it back into context, and creating those walls and barriers makes that difficult, because they can't see it. And if they do need to get into it, they're not able to get into it quickly. The time to query is very slow, even before you get to working on the masses and volumes of data that we have today.
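To make that "write code, package it, run it" cycle concrete — and to set up the repeatability point that comes next — here is a minimal pandas sketch of preparation steps captured as a reusable function rather than one-off edits. The file name, columns, and cleaning rules are hypothetical examples:

```python
# A minimal sketch of capturing data preparation steps as a reusable,
# parameterized function instead of one-off edits. The file, column
# names, and cleaning rules are hypothetical examples.
import pandas as pd

def prepare_transactions(path: str) -> pd.DataFrame:
    df = pd.read_csv(path)
    # Each step is recorded in code, so the same shaping can be
    # re-applied to next month's file instead of redone by hand.
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")
    df = df.dropna(subset=["customer_id", "order_date"])
    df["amount"] = df["amount"].clip(lower=0)   # treat negative amounts as data errors here
    return df.drop_duplicates(subset=["order_id"])

# The same steps then run unchanged against each new extract:
# clean = prepare_transactions("transactions_2016_03.csv")
```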
Our systems really weren't designed to handle those volumes. And there's a certain amount of repeatability that users want. If you're preparing data within an Excel spreadsheet, none of those data preparation steps are ever captured. So every time you go back and move to another data set, you have to make those changes again and reshape your data. And obviously, sometimes it's not that easy to make changes to the data. Again, you're writing SQL scripts, you might be doing some Python scripting, there's a lot of coding, you're building data integration workflows and things of that nature. So there's no warm, fuzzy feeling when it comes to data. And part of that is that we built systems in an era of lower volumes, focused on transactional information. In our modern environment, we're dealing with a different beast when it comes to handling information and being able to harness everything.

So before we move on, I'm kind of curious: as we've talked about some of these barriers and even some of the opportunities, what are your biggest challenges when it comes to preparing data? I'll give you a few seconds, about 15 seconds, to answer this poll as Shannon gets that up.

I'm sorry, Michelle, I don't have the option for that one.

Oh, you don't have the option for that one? No? Okay. Well, then, all right. But we'll keep moving forward. Don't worry about it. Not a big deal. We're flexible. We'll just keep moving on.

So let's get down to this. How do we get a handle on our data? How do you cost that out? How do you align the way that you manage that information, and the way that you use that information, to the value that can be achieved? I want you to think about your data management strategy and how you right-fit it. The way that I look at it is across four different areas. First, there's the amount of data being used. And what does that represent? It represents not only the fact that you have access to the information you need and that it's the information that's meaningful, but also the fact that you have the opportunity to extend beyond that, so that as your environment changes, you can more easily incorporate more and more information. At the same time, you also have the ability to dispose of information that, while at one point it may have been meaningful and useful, may not be anymore. So access is a big deal — and most organizations today are really relying on only about 20% or less of the data that's in their systems. And while there's certainly an eye to gathering more public and premium data, we're just at the beginning of doing that. So for the most part, data use cases are defined by a lot of the transactional information we have today, and that's manifested in our BI systems or our transactional and application environments. The second area we look at is data interaction. How accessible is this information? How available is it? Is it easy for your business analysts and business users to get to the data, and how frequently can they get to it? Third, we want to look at levels of accountability. Who really owns this information? Who is taking the most responsibility and accountability for this information on a day-to-day basis?
Is it your IT organization, where they're bringing in that transactional information, managing the data center and the warehouses, and building out the BI environments — large, scalable, repeatable types of capabilities? Or is it business-driven, where it might be about looking at something new or starting to incorporate other types of analytic capabilities? And the fourth area, the fourth dimension you want to be looking at, is time to first value. When do you get to act upon that information and have an impact and an effect on your business, with your customers, and on your ability to perform and be competitive in the marketplace?

Now, why do I look at these four dimensions? I think what's really important to recognize is that oftentimes when we think about data — particularly from a governance perspective, and when the quality of that information has to be focused upon — we start accounting for metrics that look only at the completeness, the accuracy, the consistency of that data, those types of dimensions. We're measuring data hygiene. And measuring data hygiene acts as an indicator of whether that data will have value or not. But really, the way we should be thinking about it is to extend beyond the quality components — and I would say even beyond the IT total-cost-of-ownership components — and take in the business context: who is using that information, and the business time to value. Those are really the two dimensions most often missing in the way we quantify data. So in the end, if you see how I've mapped this out: as you start to use more and more data, and as the business becomes more and more involved in using that data — getting their eyes and their hands on it to shape it the way they need — the interaction is obviously increasing, and you have a faster time to value. So for the new types of capabilities you want to introduce around exploration and discovery, for the data mining efforts you may be doing, or if you want to take steps into prescriptive analytics to make you more competitive on strategic and growth objectives, this is where you're going to want to start mapping out your metrics to help you understand whether you're going to attain those goals.

So with those dimensions, what does that look like? I don't have everything here, but I wanted to provide some things to think about at a high level. Certainly for your business it's going to be a bit more unique, depending on the analytic or operational activities you're going to perform, as well as the different types of goals that are going to contribute to the business outcomes you're measuring toward. Obviously first, at the top of the model: what are you trying to do, positioned toward a particular business outcome? When we talk about time to first business value, what does that business outcome look like? What is your upside? Did I improve my strategic position — have I done something like developing a new business model or extending a portfolio? Have I done something to improve my growth, like improving the value of, or my net promoter scores with, my customers? Or is it a de-risking factor? Do I need to be thinking about how I'm solving against regulatory requirements?
Do I need to be looking at where the quality of information is creating bottlenecks in my business processes, or things of that nature? So always look at the upside and the risk factors, and don't just rely on the risk factors, because even in instances where it is more of a strategic or growth objective, there are still risk factors you have to take into account. So include those in your outcome. On the other side, maybe you do just want to look at increased productivity, because you know that by increasing that productivity, you're going to have business gains. One of the areas you might look at is whether you're able to go deeper with your data: if you can reduce the time it takes to build a forecast from a week down to hours, what does that do? It opens you up to looking at other angles and other factors that are influencing your forecast, and even revising those forecast models to be much more accurate. And then lastly, there are things like wanting better capabilities. It's sort of a greenfield: you want to introduce something like prescriptive analytics, and you haven't had the capabilities to do that before, but that's where the investments are going to come from. I'll give you an example. ConocoPhillips did a project where they took a look at how long they could keep a drilling platform going. In that scenario, by using prescriptive analytics, they were able to extend their platform drilling time from three to four months. Now, they didn't know in advance that they would be able to do that. They didn't know the amount of time they were going to be able to extend that period, and certainly that converts into additional revenue they could capture. But they knew intuitively that if they could incorporate better analytic capabilities — which meant wider sourcing of information to have a more accurate understanding of what would affect the expansion of that drilling time — then extending their capabilities was going to do that. And that's scaling intelligence. So those are some business-outcome things you can take a look at.

And then, where are the metrics to track on this? Now, obviously you're going to see things here that are slightly different from what you might see in a total-cost-of-ownership model. You can incorporate things like the amount of data used. If you're not putting all of your data — or a significant portion of your data — to use, why are you holding on to it? Why are you managing that information? What is the cost associated with doing that? Or what is the potential impact of not using that information and flying your business blind? So take that percentage into account and really understand whether you're putting all your data to work for you. The second thing is: how long does it take to understand that information? When you're working on greater volumes of data, you're oftentimes having to look at it through different lenses. You're looking at different sources of data independently, and you're focusing in on those as you bring some of those sources together.
And that can take a long time, especially if you're trying to write a lot of SQL and Python scripts to move through that information. So how long is that taking? Or how long does it take because you're waiting on resources to support your business analysts and get them the data they need? That's incorporated into that time-to-understanding as well. And time-to-understanding isn't the same as time to first value, but it is a facet you want to include, because it's a portion of it. Next, data lake adoption. How many data lakes are sitting out there? Or how many have you stood up that just aren't really being used? Is only one person using one, in a small area of your organization? Have you developed a lot of pilots that aren't getting traction? Again, take a look at what that is: how long it takes from implementation, to the time it's ready for production, to the time when you see not just first-use adoption but incorporation into your overall data and analytics capabilities across the organization. So really define adoption not by whether you could get a pilot going, or you've got a single user, or it's driving only one type of analysis, but by how much further it can go. Next, how are you increasing and decreasing resources? This is typically core to any type of cost-management or ROI model, but really think beyond just your IT resources and how you're using them at a data engineering level or within your data center. Look across your organization. There is a pipeline and a lifecycle for how data is brought into your environment and moves forward to the point where it is actually being used. Oftentimes we assume that just because you make a warehouse or a Hadoop environment or a cloud environment available to your business users, your work is done and all they have to do is start accessing it. But there's often a significant amount of work, effort, and resources required to take data through that last mile, and that's oftentimes where the bottlenecks — and the decrease in business value from your data — occur. Oftentimes we look to our governance teams to improve upon those, and again, they apply your data policies, your standards, the business logic and business rules that go into the data, defining it across your data quality dimensions, lifecycle dimensions, and so on. But oftentimes data governance teams can't keep up with the speed of both the volume and variety of information coming into the organization and the volume and variety of data requirements, requests, and insight needs being accounted for. So certainly governance teams and stewards are there to begin and support those efforts, but it doesn't scale. So really look wide and see how you're delivering complete, end-to-end data capabilities. And that certainly goes to responsiveness, because everything changes. If you look at competitive disruption in most markets today, it's happening quickly. You could say markets used to mature over a five-to-seven-year period, and oftentimes that's being shrunk into a two-to-three-year period where you see massive changes — and our business stakeholders have to respond to those.
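Pulling a few of the metrics from this section together — the share of data actually in use, time-to-understanding, and pilot adoption — a minimal tracking sketch might look like the following. The input numbers are purely illustrative assumptions, not benchmarks:

```python
# A minimal sketch of tracking the data-value metrics discussed above.
# The input numbers are illustrative assumptions, not benchmarks.
from dataclasses import dataclass

@dataclass
class DataValueMetrics:
    datasets_available: int            # datasets landed in the warehouse / lake
    datasets_in_active_use: int        # datasets actually feeding reports or models
    avg_days_to_understanding: float   # request -> analyst can reason about the data
    pilot_count: int                   # pilots stood up on the platform
    pilots_in_production: int          # pilots adopted into production use

    @property
    def pct_data_used(self) -> float:
        return self.datasets_in_active_use / self.datasets_available

    @property
    def adoption_rate(self) -> float:
        return self.pilots_in_production / self.pilot_count

m = DataValueMetrics(datasets_available=400, datasets_in_active_use=70,
                     avg_days_to_understanding=12.5,
                     pilot_count=8, pilots_in_production=2)
print(f"Data in active use: {m.pct_data_used:.0%}")        # ~18%, near the 20% cited earlier
print(f"Pilot-to-production adoption: {m.adoption_rate:.0%}")
```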
If you look at the way businesses report on themselves, and where Wall Street is putting pressure on them to provide clear understanding and transparency around the business decisions and actions being taken — certainly there are the quarterly reports, but there's a lot more scrutiny happening on a week-to-week, month-to-month basis — that puts more pressure on organizations to be highly responsive, highly agile, and highly flexible to meet those changing dynamics. And then lastly, look at retirement: the opportunity to retire or consolidate technology services and simplify those environments. Even as you're modernizing, look at ways to avoid over-complicating what you're putting into place. It still needs to be managed. You still need to have the skills there, and a straight replacement of skills for a new technology isn't necessarily saving you money — in fact, it could even cost more, because some of those skills are highly sought after and not many people may have them. So really look at the full investment being made in how you support technologies within your organization, even as you modernize.

Putting this into context, we can look at an example from an auditing firm. They have thousands of analysts who spend a lot of their time cleaning, organizing, merging — really munging through a lot of information — to complete their audits. The challenge was that they spend so much of their time just getting to the basic information for an audit that sometimes they can't really go deeper. Why does that become a problem? Because if you can start offloading some of the time they spend on cleansing, you can actually increase revenue capacity by 27%. Why is that the case? Because they're able to go deeper. They're able to profile. They're able to perform deeper analytics that can identify areas of risk for organizations, and they can extend their service offerings from that perspective. So that's one way of looking at how you can build an ROI case.

Now, that being said, at Paxata what we've done is build a somewhat rudimentary calculator to help you get started. Again, this is by no means incorporating all of the different facets you might want to put into an ROI calculation, but at least it can give you a foundational baseline, and you can add the aspects that make it more relevant to your organization. What you'll notice in the ROI calculator is two things. One, it's divided between what's happening within the IT organization and what's happening within the business and the business analyst community. Why is that important? Again, because ROI has to take into account both sides — a holistic perspective of your organization and the way it manages data. As I talked about before, it really is that end-to-end model, from bringing data in to the end point where that data is being used and value is being received. So really what we've done is look at a few things: the number of people supporting these types of capabilities, their hourly rates, and how much time it takes them to prepare data or munge through that information — and the same, on the other side, for the business analyst community.
So basically we're looking at time and resource factors here, understanding the cost that goes into them, and then also looking at whether an analytic project is directly associated with revenue generation. And this is important because it balances back in whether you're trying to contribute to the upside of your organization or trying to de-risk. So we can take that into account, and then you can put in anticipated contributions to revenue — being conservative. A lot of the numbers you're seeing here are fairly conservative numbers that I could see at a company, and certainly depending on your region it's going to change a little bit, particularly around your hourly pay rates, whether you account for them fully loaded or not, and things of that nature, or a conservative figure for what an analytic project will do to generate revenue. But what you can start seeing here is that it's calculating out the technical and non-technical resources, the number of hours being put into this over the course of the year, and the annual cost associated with the work being done. So this starts to give you a bit of a baseline in terms of resources.

And what does that translate into? Well, let's take a look at these headcounts. You can start to see what happens in environments where you're using Paxata to prepare the data. What does that incorporate? It incorporates people who are self-service enabled to access their personal data, which may reside in spreadsheets on their desktops; proprietary information, which is data that lives within the organization; public information that sits outside; and premium information that might come from places like Dun & Bradstreet or Bloomberg or some other industry source that sells data as a service. So we're comparing that self-service capability — to access and prepare information, then make it available and publish it out for analytic or operational use cases — with what happens in traditional pipeline development, where data engineers are building data marts and data integration pipelines to land information that you can put an analytic tool on top of, while also taking into account that it may not be fully prepared in the context of how those users need it. And what you see is this recovery from having Paxata in place and helping in those areas: for the model as I've set it up, it could actually save you the equivalent of 16 man-years, or approximately 1.6 — sorry, it's a little small over there. But you can see where the cost savings come in, in terms of creating more efficiency with your employee resources.

Next, how can you add resources? We're always looking at where you can free up resources, but also where you can find additional resources to do other things. In the same context, comparing Paxata to traditional methods: with Paxata, you can actually realize a 31% increase in headcount availability. What does that mean? It means the hours that were spent doing more technical work, waiting for information, or doing a lot of manual effort are recovered as Paxata starts to assist in that data preparation.
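For readers who want to reproduce this kind of calculation, here is a stripped-down sketch of the same arithmetic — headcount, times hourly rate, times the share of hours spent preparing data, on both the IT and business sides. The headcounts, rates, and preparation shares below are placeholder assumptions, not the inputs from Paxata's calculator:

```python
# A stripped-down sketch of the ROI calculator's resource arithmetic.
# All inputs are placeholder assumptions, not Paxata's actual model.
HOURS_PER_YEAR = 2000  # rough full-time working hours per person

def annual_prep_cost(headcount: int, hourly_rate: float, prep_share: float) -> float:
    """Annual cost of hours spent preparing data rather than analyzing it."""
    return headcount * hourly_rate * HOURS_PER_YEAR * prep_share

# Traditional pipeline development vs. self-service preparation (assumed shares).
it_before  = annual_prep_cost(headcount=10, hourly_rate=85, prep_share=0.60)
biz_before = annual_prep_cost(headcount=40, hourly_rate=60, prep_share=0.40)
it_after   = annual_prep_cost(headcount=10, hourly_rate=85, prep_share=0.25)
biz_after  = annual_prep_cost(headcount=40, hourly_rate=60, prep_share=0.15)

savings    = (it_before + biz_before) - (it_after + biz_after)
ftes_freed = 10 * (0.60 - 0.25) + 40 * (0.40 - 0.15)  # headcount-equivalents recovered

print(f"Annual preparation cost recovered: ${savings:,.0f}")
print(f"Headcount-equivalents freed up:    {ftes_freed:.1f} FTEs")
```

With these placeholder inputs the sketch recovers about $1.8M and 13.5 FTE-equivalents a year; the model shown in the webinar, with its own inputs, lands around 16 man-years.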
Those recovered resources can either be applied to go deeper on the types of things they're already doing, or to take on more types of analytics or initiatives within the organization. And really what that tells you is: in your ROI model, as you identify where resources are freed up, account for where those resources can be reallocated — either to take on other projects that have their own ROI and new contributions to the business, or to help out in just making things better. So that's another factor that can go into your overall return on investment. Sorry, just want to make sure. And then lastly, you get back to productivity. How productive is your team? How much more can they take on? Where does capacity come in, and where does that contribute to the existing projects where you're going deeper? In this perspective, capacity has increased 23%. So you get a bit of an idea at this point of how even three basic areas — productivity, added resources, and the number of resource hours — can be incorporated not just into a baseline ROI model, but linked out to other capabilities, initiatives, and projects that you have in place.

And just to sum this up: as we've had conversations with some of our customers, I think this is the quote that I've always really liked, from the Chief Data Officer at a large financial services firm — a banking and credit card firm: "Prior to Paxata, we struggled with cumbersome data processes that were impossible for us to audit or automate, and our only approach was to just throw more bodies at the problem." I think that's pretty typical of what you see in most organizations, and that even in many self-service environments there's still a lot of coding that needs to be done. There's still a lot of heavy lifting that needs to be done by those business analysts, and even the data science community. What Paxata is really trying to introduce is a way to improve upon that, to make those analysts and business stakeholders much more productive in the way they handle data, and to create a relationship with their data, too.

So how does Paxata look at this? Well, we look at it in three ways. First of all, we take into account that there really are three primary stakeholders focused on creating great data for the organization. There's obviously your business analyst, the one who's actually building the insights. And what do they want to do? They want to engage with the data themselves, because that helps them do their jobs, do their jobs better, and do their jobs faster. For every question that comes up, or every assumption or hypothesis being made, they can't always articulate exactly how the data should look — but they know it when they're looking at it, and they know it when they can interact with the data and the data looks great. Now, they're working for the heads of analytics, or a digital analytics officer — sometimes you hear the term chief analytics officer coming up — and these are the people in your organization who are thinking bigger. They're thinking about how do I build insights, and how do I use insights to be an insight-driven business?
And so they're building these centralized as well as decentralized competency centers to democratize insight, and they really want to power the business. What they're really tasked with is delivering meaningful information to drive business outcomes. Where Paxata helps them is by providing a unified platform that any business analyst or business user can use to access, prepare, and publish data to the business, while at the same time giving them independence in the analytic tools they want to use. Are they using R? Are they using SAS or SPSS? Are they using Tableau or Qlik? Or are they working within a BI environment like Cognos or Business Objects? In all of those scenarios, there's still a lot of freedom to operate within the analytic and business applications they're used to, but they have a unified and collaborative platform where each can harness the others' work, collaborate, and ensure they're building upon each other's tribal knowledge to bring forth the information that makes the business better. And in the end, there's also your chief data officer. There's been a lot written about chief data officers — where they come from and how they're effective — but at the end of the day, they're really the stewards of the information, saying: I want to ensure the organization is enabled to get the most value out of its data, and to do that at scale. So they're responsible for a lot of the modernization, not only of the data systems but also of the operational processes, to ensure that the business gets the most value from its data. And they often work very closely with the chief analytics officer, the enterprise or information architecture teams, and the data engineers who are developing the environment that the analysts and analytics teams are going to use. So Paxata is really trying to serve these three customers: providing an application environment that is user-friendly, intuitive, visual, and flexible for what a business analyst needs; providing a platform that a chief analytics officer can have anybody on her team use and collaborate with; and giving the chief data officer a platform to serve the broader business needs, data services, and pipelines that an organization requires. The Paxata Business Information Platform is really designed to address needs around data that are supported at an infrastructure level for scale within the IT organization, but that are also scalable and repeatable for the business organization. And at the same time, anything that's happening with the individual business analysts and the analyst community — all the iterative exploration and discovery with the data, the ad hoc types of analysis — is captured and can be put into a repeatable, scalable type of environment. So really, what Paxata is doing is allowing data to be prepared by the business and scaled for the business. It's enterprise-grade and it's business-grade.

And what does the architecture look like? We really see it in a variety of layers.
Certainly, taking into account first who is going to use this environment — from developers to data scientists to analysts to the general information worker — we designed the system as an application, an application that is relevant to the types of use cases where the platform will be used. That could be BI and analytics, but it certainly also supports some of the transactional requirements: data marts, data as a service, and other types of custom apps that come into play. At the core is where the platform excels, with all of the comprehensive capabilities you would expect in an information management environment: integration, quality, enrichment, governance, and collaboration, with the security and administration that goes along with them. And really, the secret sauce is the intelligence and the semantic cataloging and library available within the system, both to apply and prepare the data so that it's contextually appropriate, and to take in the subject matter expertise of the analysts and data scientists as they impart the business nomenclature and context back in. We automate that. We have a connectivity framework with out-of-the-box connectors that lets you simply ingest information, prepare it, and push it back out as repeatable services, setting those up as pipelines and scheduling them. In fact, some of that can actually be done by the business analysts themselves, so the answer sets are continuously refreshed. And then the ultimate secret sauce is the fact that we are a Spark application with a Spark architecture. We don't just run on Spark: we have a fully integrated capability where we are able to granularly manage Spark to account for a wide variety of performance requirements — batch, micro-batch, or more real-time — to create the most interactive, responsive system as you're preparing your data. We maintain that data in a persistent state, and we sit on top of a Hadoop environment, making the platform available in any type of environment, whether that's a traditional big data environment like Hadoop, cloud, on-premise, or hybrid scenarios.

So with that, some key takeaways when you're thinking about building your ROI model. Account for the fact that it should be tuned to at least one of these four capabilities, on the upside and on the risk side: strategic, growth, efficiency, and risk metrics. You want to look at metrics that account for accessibility, interaction, accountability, and time to value — take a look at the quadrant I showed you earlier in the deck. Account for the fact that business outcomes are personal, operational, and organizational: your outcomes are going to affect different areas of your organization, which means the ROI foundation is going to be slightly different. If they're more operational, it could come down to IT total cost of ownership. If it's personal, it's the productivity and effectiveness you're getting out of your workforce. And if it's organizational, it links back up to the broader strategic, growth, efficiency, and risk factors you're looking at. And then lastly, build an information platform that translates tribal knowledge into organizational IP. One of the biggest challenges we have today, in how we've traditionally managed information, is thinking that we can know all of our requirements up front and that everything can be pre-built.
We are simply operating too quickly as a business, and the volume and variety of information coming into our organizations — and available to us — puts too much strain and pressure on our existing information management practices. We really need to look at new ways and methods to support getting the most out of our data, the highest value out of our data. And that requires thinking about more of a citizenship model, where information isn't just democratized, but where the ability to manage, be responsible for, and be accountable for that information is also a democratized experience. So with that, I just want to thank you for taking the time today and joining us for our presentation on how you can calculate the cost of data. Certainly, come to our website and sign up for a free demo. We would love to share with you the Paxata experience, the Business Information Platform, and what you can get out of it. And with that, in the last six or seven minutes, let me open it up and see if there are any questions I can answer.

Hi, Michelle. Thank you so much for this fantastic presentation. We certainly have questions coming in. Of course, one of the most popular questions we receive is about receiving the slides. Just a reminder: I will send a follow-up email by end of day Thursday with links to the slides and to the recording of the session, along with anything else requested throughout. So, getting down to it: when you were asking what the biggest challenges are when preparing data, a comment came through saying "business users not understanding the business or business process." Do you want to comment on that?

So let me see if I understand this. One of the biggest challenges is that the business users don't understand the process — like, how do they go about preparing their data? Is that it? Okay. Yeah. So there are two sides to that. How well have you communicated and trained the organization to know how they can get access to their data, what resources are available to them, who they should contact, and so on? In large organizations, that's certainly an effort, and your governance team should really be thinking about how to do that. And what are the support systems out there? What's the hotline they can call or email, or, if you've got some sort of task management system, leveraging that — you know, set up Slack. That's what we do. We have Slack, so when anything happens, somebody's out there. But on the other hand, if it is a bottleneck, I think you also have to question what you don't understand about the way your business users and analysts want to interact with the data, or what their needs and requirements are to access it. And really balance an understanding of when they're too hazy — and I use that term loosely — about what they want to immediately get out of it, versus when they have hard requirements. I'd say there's really this 80/20 rule out there today, where 80% of the time, when a business user or business analyst wants the data, they have a general understanding of what they're trying to figure out, but they don't know the exact questions. They don't know the exact data elements they want to incorporate. They may or may not have an idea of the exact data sources they require.
And so I think organizations really have to consider how to adopt capabilities for this more fluid, iterative discovery process in the hands of those who are less technical — versus what we've done with data scientists, which is really to give them the keys to access what they need, since they're coding and scripting and are fairly technical in and of themselves. I think that same self-service capability is necessary for your business analysts and users as well.

Sure. And we just have a few more minutes, so let me see if I can sneak in a couple more questions here. Do you have an example of data cost ROI from a manufacturing business?

Ooh, I will need to get back to you on that. One that comes to mind — and honestly, this is coming a little bit more from my Forrester days than my Paxata days, although I'm sure we have something we could provide in that realm too — was a really interesting conversation I had with L'Oréal. They were looking at master data as it related to their production lines, because of the automation and the data used in those lines: if the master data wasn't accurate, it could cause issues in the ability to create the cosmetics and so on. So they were really linking up what the information needed to look like at different points within the production processes so that their production lines didn't go down. And, to be honest, that was done a little bit more in the old-style MDM framework — requirements, standards, policies, the governance framework that came before. But I think where data preparation capabilities start coming into play today is that you can be much more agile in doing that, because the business users who really own that data — even the product managers or the merchandisers who own that information and have all of that product data available — can immediately go in with a data preparation tool, look at the data that is feeding the plant line, come back out, rapidly change and update that information, and immediately deploy. You shorten the SDLC time for making those changes, you do a bit of a run-around on the traditional MDM capability, but you do it faster and at a lower cost, and you start opening up your lines a lot more than you would otherwise. So that was one of the things they did, and they definitely had a significant reduction in production line downtime once they started putting those things into place. And like I said, that predates data preparation tools, but you can see how the two could come together and produce a better outcome with newer capabilities.

Well, that brings us right up to the top of the hour. Michelle, thank you so much for this great presentation. Again, to remind everyone, we will be publishing the recorded webinar and slides to dataversity.net within two business days, and the follow-up e-mail will go out by end of day Thursday with links and other requested information. And thank you again to Paxata for sponsoring today's webinar. As always, a big thank you to our attendees for being so engaged in everything we do, and for your great questions. We just love it. So I hope everyone has a great day. Enjoy. Again, Michelle, thank you so much. This was great.

Great, thank you too. Thanks to everyone.
Have a nice afternoon.