Hello and welcome. My name is Shannon Kemp and I'm the Executive Editor of DATAVERSITY. We'd like to thank you for joining this latest installment of the monthly DAMA International Webinar Series. This webinar series is designed to give our Enterprise Data World Conference attendees education year-round. This month, John Evans will be discussing "The Theory of Everything: Is It Time to Rethink Data Management?" Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A in the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using hashtag DAMA. If you'd like to chat with us or with each other, we certainly encourage you to do so. Just click the chat icon in the top right for that feature. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and additional information requested throughout the webinar. Now let me formally introduce today's speaker, John Evans. John is an information strategist, self-confessed data quality geek and the founder of Equillian, an independent UK consultancy practice specializing in enterprise information management. For the past two decades, he has been helping organizations harness their information and transform it into a strategic business asset. As a regular speaker and panelist at industry events, John enjoys bridging the gap between the business and IT domains, bringing fresh understanding and clarity. It's the same approach he adopts as a respected information management coach and mentor. With that, I will turn the presentation over to John to get us started. Hello and welcome.
Thank you, Shannon, and thank you for giving me this opportunity to speak to you today. Just one second. Let me just get some focus here on the screen. There we go. Okay.
I'm John Evans, and as Shannon explained, I'm an information strategist. Basically, that means I help organizations get smarter with data. It's something I've been doing for quite a few years now, and of all the areas I get involved in, I have a particular passion for data quality. That's why I tend to refer to myself as a data quality geek. Geek is quite a fashionable word nowadays, so I don't mind that one bit. As Shannon mentioned, do please tweet throughout the talk or after the talk. Let me know what you think. My Twitter handle is MadAboutData. I look forward to receiving some messages from you. Just a little bit about my company. As Shannon said, we're based in the UK, and we basically help organizations with their information management challenges. We're entirely independent. We very much help organizations find a steady path through the maze of information management. Okay. What are we going to talk about today? We're going to talk about The Theory of Everything. I'm referring to the movie. The movie is a true story. It's about Stephen Hawking. He's a British cosmologist and probably one of the greatest scientific minds we've ever known. If you haven't seen the movie, it's basically a story of love, courage and triumph over adversity. It follows Stephen's quest for the theory of everything. What is that? It's a single unifying equation that explains everything in the universe. All of this is intertwined with his battle with motor neurone disease. It's absolutely fabulous. Eddie Redmayne gives an excellent performance. If you remember only one thing from today's talk, it's to watch the film. Promise me you're going to go on Netflix later today and enjoy two hours of pure magic. One of the things that really struck me when I watched the film, which was a few months ago, was Stephen's courage. It's his courage not only to battle his own illness, but also to challenge the accepted wisdom, even the findings from his own research, and think differently.
So that's really the subject of today's talk. Today's talk is about thinking differently. I guess because I've been in this information management industry for quite a few years, I've started to question some of the things I used to take for granted. So let's start with an easy one: data or information, which is it? These are terms we use every day. Not only do we use them in our professional careers, if you work in this field as I do, but we use them in our everyday lives. They've become such a part and parcel of our everyday language, but we rarely question what we actually mean when we talk about data or information. Now, there are really two schools of thought on this. So if you look at the industry we're working in, in the blue corner, we've got the likes of DATAVERSITY, who've invited me to speak today, probably the foremost educational and research organization for data. We've got the likes of DAMA International, a wonderful body that's done a huge amount in the area of data management over the last 10, 15 years. We've got Enterprise Data World, probably the biggest data-oriented conference in the world. We've got the likes of Data Blueprint, you know, thought leaders very much pushing forward our data management industry. We've also got those who tend to use the term information, and I'm one of those. So my company is Equillian, and our tagline is Information Strategy. In the UK, we've got the Information Commissioner's Office. Now that, if you don't know, is the body that polices our data privacy law over here. And we've got various conferences with information in the title. We've got publications across the world that use information in their title. So we've really got two schools of thought. We've got the dataphiles, and we've got the infoholics. Now, this battle's been going on for a while, but it's not really a battle we talk about. People just accept that there are two words that we use. But it doesn't stop there.
You only have to look at some of the things that we talk about when we're trying to develop our information and data capabilities. The terms are interchangeable. We often talk about data assets. But then again, we might talk about information assets. We might talk about data quality, information quality. Pretty much any term that you might use data with, you could also use information. And they're all equally accepted. So really, is it any wonder our stakeholders are confused, when we don't really define what we mean by these terms? So if I think about the way I've explained this to my clients over the years, I've talked about the data value chain, and this is it here. I've talked about how we go from raw data through to refined decisions. And the way I explain it is like this. Data is a digital representation of objects and events. It's our raw input. It's the raw materials we put into our businesses. Information is data that's been collected and organized. It's what we tend to call data in context. Knowledge is derived from this information. It's the understanding. It's the interpretation of information. It's the insights we gain. And our decisions are based on knowledge. So the decisions are the judgments we make based on our acquired knowledge. They're the outcomes. And one thing's for absolutely certain: the quality of our data affects the quality of our decisions. It's something we've talked about and something we've tried to get across for so many years: rubbish in, rubbish out. And we accept this. This is the way it works. This is our data value chain. I'll give you a very quick example. So three simple numbers. This is our data: 21, 6, 2016. If we turn that into information, it might be that today's date is the 21st of June, 2016. The knowledge I gain from this is that my summer break is only a few weeks away now. And the decision I make is that I'd better start organizing myself. I'd better start booking my flights and hotel.
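As a minimal sketch of that value chain, the date example can be walked through in Python. The three numbers are read as 21, 6 and 2016; the function names, the holiday start date and the 30-day booking window are illustrative assumptions, not something given in the talk.

```python
from datetime import date

# Data: three raw numbers with no context attached.
raw = (21, 6, 2016)


def to_information(day: int, month: int, year: int) -> date:
    """Information: the same numbers organized as a calendar date."""
    return date(year, month, day)


def to_knowledge(today: date, holiday_start: date) -> int:
    """Knowledge: an interpretation, here the days left until the break."""
    return (holiday_start - today).days


def to_decision(days_left: int, booking_window_days: int = 30) -> str:
    """Decision: a judgment based on the knowledge gained."""
    if days_left <= booking_window_days:
        return "book flights and hotel now"
    return "no action yet"


today = to_information(*raw)
# Hypothetical holiday start date, chosen only for illustration.
days_left = to_knowledge(today, holiday_start=date(2016, 7, 16))
decision = to_decision(days_left)
print(today.isoformat(), days_left, decision)
```

Each step adds context or interpretation to the same three numbers, which is the point of the chain: a wrong digit at the data stage would flow straight through to the decision.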
It's a very simple example. Let's look a little more closely at data versus information. We said that information is derived from data. So what are they? Well, data is the raw material. It has limited context. There are no real relationships because it's basically unrefined. It gets turned into information when we process it. That creates meaningful context. And we start to define the relationships. We have explicit relationships now. So this is all very simple. So information really is data that's been through the correct processing to make it more useful. It's simple. But it does raise this question. At what point does data become information? We've shown it here as a transition between the two. There must surely be a point where suddenly data is no longer data, it's information. So let's have a quick look here and see if we can work this out with a few examples. So let's imagine a file containing raw sales figures. Now, which is that? Is that data or is that information? Well, it feels to me like that's probably data. It's raw. It's unprocessed to a certain extent. It kind of feels like data to me. Now consider a formatted report that compares sales performance across different territories. Now that's been through a certain amount of processing to create that comparison. It's been refined. Surely that's information. So what if we had a formatted report that just lists sales figures by territory? Now, is that data or is that information? It's kind of one or the other. It's had a small amount of processing applied to it. But is it really information? Is it going to help us make decisions? Well, perhaps. But we're probably going to have to use a little bit more of our own intellect to help us make the decision, not only the information provided to us. So it's kind of somewhere in the middle. So actually, this happens all the time. We have a data continuum.
We have everything from pure raw data through to highly refined information that helps us make decisions. And it's not a simple step change. It is a continuum. There's a gradual progression from data to information. But of course, there are two different perspectives on this. From the machine's perspective, it's all just data. To your laptop, to your mobile phone, to your desktop computer, it's all just data. It's just that some of it's been a bit more processed than the rest. To a human, it's all information. It's just that some of it is more useful than the rest. So these two different perspectives are about the same thing. We have the machine perspective and the human perspective. So I think we need to start thinking about this a little bit differently. We have these two parallel, coexistent worlds. We have the data world. And we have the information world. Now machines operate in the data world. All they really understand is data. So we use them to store data so we can readily access it as information. But humans operate in the information world. When we see data, we can't help but turn that data into information. We do it subconsciously. Our brains interpret the data we see, and that interpretation becomes information. So really, information is simply the human interpretation of data. It's as simple as that. These are two terms for the same thing. As soon as data contacts humans, it becomes information. Our brains force it. So I hate to break this to you, but to a machine, the sophisticated sales report you spent so long developing is just data. And don't forget, to a human, even a small set of sales figures is still information. We can still make use of it. We might have to do a bit more ourselves, but it's still information. A similar question was posed quite a few years ago. In the 18th century, a philosopher called George Berkeley asked a question: if a tree falls in a forest and no one is around to hear it, does it make a sound?
Now this question is still around today. People still ponder over this. Is sound really just a sensation in our ears, as opposed to the vibration of the air? So I'd like to pose a slightly different question. If data is stored on a computer and no one ever sees it, is it information? Actually I don't think it is, because I think whilst it's on a computer and it's not being used, it doesn't really have the impact of information. It doesn't help us make decisions. It's just sat there. It's a bit like the tree in the forest. If the data never actually reaches us and we never have the opportunity to do something with it, it's just data. Let's return to our data value chain. Let's have another look at this. Let's put a human in the picture. A human deals in information. And so really we're looking at an information continuum. Data and information aren't separate entities. It's all information. It just progresses from its raw form to a refined form. And similarly, in the machine world, we have the data continuum from raw, unprocessed data to highly processed, refined data. And of course in the age of artificial intelligence and machine learning, machines are starting to gain knowledge and starting to interpret the data more. They're starting to make decisions. So actually we have these two continuums. We're just looking at this from different perspectives. From the human perspective, we're dealing in information and refining it. We're using it to gain knowledge and make decisions. And more and more, our machines are doing the same, but they're dealing in data. So where does that leave us? We've still got these two schools of thought. We've still got organizations that call themselves data organizations. We have organizations that call themselves information organizations. So do we need to change our vocabulary?
Do I need to speak to Shannon and convince her that really she needs to be calling her organization Infoversity? Or maybe she'll say to me, John, you need to change your tagline, mate. You need to be called Equillian Data Strategy. I don't think we need to do either of those. I think we just need to agree that we're talking about exactly the same thing. We're just talking about it from slightly different perspectives. So what about unstructured data? Everything we've talked about so far is what we traditionally call structured data. So what is unstructured data? Data that has no structure? Well, surely all data has some structure. Otherwise it's not really data, is it? Oh, okay, but how about documents and reports? Well, in my experience, all documents have a certain amount of structure to them. And even the language, even text has structure. Language follows the structural rules of grammar, otherwise we wouldn't be able to read it. Okay, so how about audio and video? Is that unstructured data? Well, maybe, but surely media files still have to conform to some standards. They still have to have some structure. They still have to contain some metadata to allow us to make use of them, to allow us to play them. So they can't be entirely unstructured, can they? So maybe we're looking at this wrong. So let's turn the argument around. Let's think about what structured data is. If we can't really define unstructured data, let's at least try and define structured data. Is it data that's stored in a relational database? That provides the structure. Well, maybe in the past, but there are so many different approaches to storing data nowadays. I wouldn't necessarily describe them as unstructured. So it looks like we've confused matters again. We've got our terminology confused. We're struggling to really understand what it is we're dealing with. So let's try and understand this by looking at another example.
Let's look at a database table containing monthly sales figures. Well, is that structured or is that unstructured? Well, it kind of feels to me like that's structured. The table has structure. Every row in the table contains the same fields. It's predictable. We know in advance what the data will look like because we've defined it. Okay. So how about our monthly board report, which shows the same table, but on page six? So it's the same information there. It's exactly the same sales figures. It's formatted in a table, but it's in a document. Maybe there are charts. Maybe there are images in there. Maybe there's a lot more than just that table. So this kind of feels like it's more of the unstructured world. It might be a PowerPoint. It might be a Word document. In our traditional language, it's unstructured. So what about an extract of the sales figures as an Excel file? Well, it's just the same data, isn't it? It's just the sales figures, but it's not in the database anymore. It's in an Excel file, and it's a file. Maybe we've added some formatting to it. Maybe we've created a bit richer information within that file. It kind of starts to feel like it's moving towards the unstructured world. So maybe that's somewhere in the middle. So again, we have this data continuum. We have firm structure all the way through to fluid structure. By firm structure we mean predictable structure, structure we know in advance. Fluid structure is a richer structure. We may know it in advance, but it might contain far more than purely tables and figures. So we have this whole continuum of data that kind of explains our structured versus unstructured question. But really those terms, structured and unstructured, aren't very helpful, because all our data has some structure. It's a continuum. We have to think about it in different terms.
So let's look again at this particular continuum, with our different versions of our sales figures ranging from our firm data to our fluid data. But then let's overlay on this the other axis. Let's look at our raw data to our refined data. Now I would say that our monthly sales figures fit somewhere in between raw and refined. They're monthly sales figures. They've been aggregated. We've done some processing to them. So they're not purely raw, but they're not actually that refined either. So they're somewhere in the middle of this continuum on two different axes. Now if we think back to where we started and we look at the other data, we had our file containing raw sales figures. Well, that's raw. We've not done anything with it. But also it's firm, because it's a very fixed structure. Then where would we put our formatted report that looks at sales performance across different territories? Well, that's quite refined. As we said before, there's been some analysis done. It's been processed. But also, because it's a formatted report and it may contain other information in there as well as the figures, it's a bit more fluid. So across this whole continuum, these are just different areas. We can place different types of data on this chart according to how raw or refined they are, and how firm or fluid they are. It's a continuum. So let me ask you this question. Of all these five pieces of data on this chart, which would you want to avoid falling into the hands of a competitor? Well, actually, I think you'd probably want to avoid all of them falling into the hands of competitors. Effectively, they all contain your sales figures, just in different formats at different levels of aggregation, but they all need to be protected. This is the important thing. They are all valuable assets, but any asset that can give you value can also introduce risk. So really, there's no difference between these. They're just different ways of presenting the data.
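To make the two-axis chart concrete, here is a rough Python sketch that places the five sales-figure assets on the raw-to-refined and firm-to-fluid axes. The coordinate values are illustrative guesses at where each asset sits, and the `DataAsset` class and `needs_protection` function are hypothetical names, not something from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataAsset:
    name: str
    refinement: float  # 0.0 = raw, 1.0 = highly refined
    fluidity: float    # 0.0 = firm structure, 1.0 = fluid structure
    contains_sales_figures: bool = True


# Illustrative positions only; the talk gives no numeric coordinates.
ASSETS = [
    DataAsset("file of raw sales figures", refinement=0.1, fluidity=0.1),
    DataAsset("database table of monthly sales", refinement=0.5, fluidity=0.1),
    DataAsset("Excel extract of sales figures", refinement=0.5, fluidity=0.5),
    DataAsset("monthly board report", refinement=0.5, fluidity=0.9),
    DataAsset("territory comparison report", refinement=0.9, fluidity=0.7),
]


def needs_protection(asset: DataAsset) -> bool:
    """Protection depends on what the asset contains, not on where it
    sits on the chart."""
    return asset.contains_sales_figures


for asset in sorted(ASSETS, key=lambda a: (a.refinement, a.fluidity)):
    print(f"{asset.name}: refinement={asset.refinement}, "
          f"fluidity={asset.fluidity}, protect={needs_protection(asset)}")
```

The protection check deliberately ignores both coordinates: every asset holds the same sales figures, so all five come out as needing protection wherever they fall on the continuum.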
So let's ask the question. Is there such a thing as unstructured data? Well, actually, we're just managing a continuum of data. Well, maybe we should ask a couple of our employees. Now at this point, I'd just like to say that if there are any records managers out there, this isn't intended to offend. This is a caricature to demonstrate a point. Betty is our records manager, and Betty's been with the company a long time. So Betty looks after our company's documents and files. She doesn't really get involved with data. When she joined, we kept lots of our information on paper, and to be honest, we still do. So she quickly learned the importance of only keeping the stuff we really need. Otherwise, her desk quickly became an absolute nightmare. She knew we had to manage our information well. Now Bill, he's our database administrator. He looks after our company's databases. He doesn't really get involved with documents. When he joined the company, we only had one server, so he did really keep a close watch on disk space and make sure we never ran out. But to be honest, nowadays, our disk is so cheap. If we run out of storage, we just buy more. Oh dear. I think we need to think about this differently. It is time to think differently. We actually have two different cultures in our organization. We have a culture that has been built up around our traditionally unstructured data, our documents, our records management. But we also have a culture built up around our data. Our software engineers and our database administrators think primarily in data terms, structured data terms. Now, these two different cultures, these worlds, rarely collide. They are very different parts of the organization. And unfortunately, this is the cause of one of the problems. We've created an artificial and unhelpful separation between structured and unstructured data. It's a separation that doesn't really exist. It's something we've created historically.
So really, what we need to do is bring these two worlds together. We need to cross-pollinate our skills and knowledge. We need to make sure that all the good practice that Betty has learned over the years about maintaining good unstructured information moves into the structured world. We have to make sure that all the good practice that Bill has learned about managing our structured data is brought into the unstructured world. We need to break down that barrier. We don't need to think in structured and unstructured terms. We have to think in data terms. That's the important thing. We have to bring our best practice together. Just because storage is cheap doesn't mean we don't need retention and archiving strategies. We do. The size of our databases is growing exponentially. We have to keep a handle on it. We have to understand the value of information and data. We have to understand what we keep and what we don't keep. But also, just because we're using SharePoint for all our documents, that doesn't mean we can do without standards. We still need to define our document standards. We still need rigor. We still need processes. We still need all the good stuff that translates into good data management. Okay. So let's come back to our value chain, our slightly revised value chain. Now, what do we really mean by knowledge? Well, when I described it earlier, I described it as the understanding we derive through interpreting information. And that is one definition of knowledge. But the other definition is the information and skills we acquire through experience or education. Now, what does that mean? Well, really, it's the know-how within our organization. It's the learnings that tell us how to run our business. Some of it's documented. Quite a lot of it is typically undocumented. It's tacit knowledge.
It's the stuff in people's heads that helps them make decisions, that helps them work day in, day out, helps them do their job, helps them understand how the business operates. It's methods, it's procedures. It's our best practice. It's even things like patents and trade secrets. This is all the knowledge within our organization about how we do stuff. So here's a question. How well do you manage the knowledge in your organization? Well, if your organization is anything like the ones I've worked with over the years, it's often the poor relation of data management. And why is that? Well, the sorts of things I hear are, "Well, it's all in people's heads, so there's nothing to manage." Well, that's nonsense, isn't it? The fact that it's in people's heads means we have to get it out of people's heads. We have to turn that tacit knowledge into explicit knowledge, and we have to manage it. Or, "We've got a folder somewhere that contains all the procedures." Well, that doesn't sound to me like good data management. Or, "We don't have a formal approach. Everyone just looks after their own stuff." Well, if that's the case, that's just anarchy. How can we ensure that the knowledge we've acquired in our businesses, which our staff have worked so hard to produce, lasts, and that there's a legacy there, so that when we lose people from our organizations, that knowledge doesn't disappear with them? What can we do? Well, let's just manage it in the same way as our information assets. At the end of the day, it's information. It's know-how. It's the documents that tell us how to do stuff. So really, our information continuum is just larger than we think. It's not just the data about last month's sales figures. It's the document that tells us how our marketing strategy is going to work. It's all the stuff that helps us. Largely, our knowledge is unstructured because it's procedures. It's process. It's best practice.
But we've already described the fact that there should be no difference between structured and unstructured data. It's just a continuum. So all we're doing here is expanding our continuum further. We're simply going to manage all our stuff in the same way. It's all information assets. It doesn't matter whether it's sales figures or whether it's our marketing strategy. It's an information asset. And of course, that same argument translates into the machine world, into the data world, because more and more our machines are gaining knowledge through artificial intelligence, and more and more they're making decisions. And so actually, this continuum is expanded not only in the human world but also in the machine world, in the data world. So probably now is the right time to ask about big data. Where does big data fit into all of this? Well, firstly, we should probably start by defining big data, because so many times people talk about big data. Do they really know what it is? Okay. So it's a term used by IT solution vendors to convince unsuspecting organizations that their systems will soon self-destruct into a gazillion ones and zeros. Okay. So I'm being slightly facetious there. This is actually a definition from the Urban Data Dictionary, which is something I created a few years ago that was designed to amuse and entertain when people got bored of talking about data management. So let's look at a more sensible definition. So big data is information in raw or processed form that has the potential to deliver significant benefits to organizations through invaluable business insight, but requires careful management and a formal approach due to its size, frequency and diversity. So that sounds like a pretty fair description of big data. But isn't that description just data? I could have used that description for data many years ago, before anyone even thought of the term big data. Data by its own definition has complexity, has variety. All these things aren't new. It's just data.
So let's go back to our continuum. Now where would big data fit on this? Well, let's think about the sorts of things that we would generally categorize as big data. So let's start off with all our connected devices, our Internet of Things as we now call it, which could be our mobile phone. It could be our Fitbit on our wrist. It could be our refrigerator that will order more beer when we start to run low. That's the kind of fridge I like. So that's kind of firm data, in that the data produced by these devices tends to be relatively fixed. It follows formats. It has a very well-defined structure to it. So it's firm. It's also typically quite raw. It's like telemetry. It's just measurements. We have to do stuff with it to make use of it. And how about all the human-generated data, all of the social media, all of the tweets, all of the Facebook posts? Well, that's far more fluid. There's a lot of textual content in these, and there might be media content as well. It's a far more fluid type of data, but it's also quite raw. It's normally people's opinions. There's not a huge amount of processing done to that when we receive it. So of course our job is to take these sources of data and refine them. If we want to make use of them, we have to do something to them. We have to process them. We have to analyze the data from our Fitbit to work out what it's telling us about our health. And when it comes to our social media data, we also have to refine it, but maybe we also want to create more structure from it. We want to try and turn what is typically quite fluid data into more firm data. But essentially these points just exist on our data continuum. They are just types of data. Now of course I've shown the data continuum in two dimensions, but we can have as many dimensions as we like. I can't draw many more than two. I could possibly draw three at a push. But we can have as many as we like. We can include volume, velocity, variety.
If you want to take the three typical Vs that we use to describe big data, we can have those or any others. We are just talking about points in space here, and all data fits into our continuum if we draw our continuum in the right way. That's an important thing. So I think it's time to think differently again. Ultimately it doesn't really matter whether our data is big or small, firm or fluid, raw or refined. In all cases we need good definitions. We need to understand what we're dealing with. This data effectively is just data. If we don't define it, we can't understand what the data is and we can't make use of it. We need to manage it. We need to manage it from the point where it's created to the point where it's destroyed. We need processes for that. We need good processes. We need to use techniques to exploit it, to derive the value from it, to refine it further from the state in which we receive it. And these need to be sophisticated. If we're going to combat the growth of data and actually gain value and eliminate the noise, we need good techniques for that. And we need governance. We need strong governance. We need to make sure that the data is being treated as an asset. That means it's protected. We don't lose it. It's not stolen. We have the right ownership. We have the right responsibilities in place to make sure that we're doing the right things. We also need some technology. We need technical infrastructure to support all of these things. So actually, it really doesn't matter what type of data we're dealing with. We still need all of these things. They're not confined to our raw structured data. They're not confined to our refined, analyzed data. They're not confined to big data or small data. They cover everything. These are the principles we adopt. We need to do this to all our data, regardless of its form, which is probably a good place to bring us to the DAMA Wheel, taken from the DAMA Body of Knowledge.
It describes the sorts of capabilities we need to manage our data. And here they are. So as I've just described on the previous slide, we have our governance. We have the things we need to do to own the data, to have the responsibility and accountability, the policies and procedures. We have the definition, the things that tell us what the data looks like, where it is, how it's flowing, how it's structured. We have the day-to-day management of the data. We have the data quality and the master data management side. We also have our document management. Again, this is kind of harking back to an older era where we were thinking about unstructured data differently. I think we need to start thinking about these things as the same. We should actually start to merge some of these. We have the exploitation side. This is business intelligence and data warehousing, moving on to things like predictive analytics, all the new ways that we can use to gain value from data. And then we have the infrastructure side, the things that need to happen in the background to support all of this. And this is the DAMA Wheel. This has been around for a few years now. And of course, the second version of the DMBOK will be out very shortly. And we'll see a new wheel. And I believe there's an extra segment on that. The DAMA Wheel is the perfect shape for many things. It has symmetry. It has beauty. It has practicality. But unfortunately, data management isn't perfect. Anyone who follows tennis will recognize that we're just approaching Wimbledon in the UK. So there's a lot of excitement for the tennis fans out there. I'm not sure a tennis ball in the shape of a cube could really work quite that well. But my argument is that we use our DAMA Wheel to describe the things we do with data, yet it describes a very idealized picture. Data management isn't perfect. I'm not sure a circle is the right way to describe that. Maybe we need a DAMA square.
So what I've done here is I've taken those capabilities, the same capabilities from our DAMA Wheel, but I've put them in a slightly different way. I've tried to arrange them to show some of the relationships between them. Because actually the problem with the DAMA Wheel is that everything looks similar. The spokes of the Wheel are all equal, and that's not the case. And this is no slight on DAMA, because the DAMA Body of Knowledge is an absolutely great body of work. But people always think of the Wheel, and I think the problem with the Wheel is it glosses over the fact that our data capabilities are intrinsically linked to each other. So I like the symmetry of the Wheel. I think it's a really good way of getting people talking about data management. I also prefer the transparency of the square, something that tells us how these things relate to each other. So let's just use an example for you. We look at the data quality element of data management. So how does that relate to the others? Well, to have good data quality management, we need data architecture for a start, because we need to identify the structure of our data. We need to identify where it is, where it's flowing. We need that to help guide our data quality efforts and to understand what the data really is. We need data governance because we need formal policies. We need roles and responsibilities around data quality to make it work. We need metadata management because we need to make sure we've got good definitions of the data. We need to define our data quality rules because those are the things we're going to be testing our data against. If we don't have those, we can't apply data quality management. These things are our enablers. So a number of the disciplines on our DAMA Wheel are the enablers for data quality management.
We've also got the beneficiaries of data quality management, because it's only through good data quality management that we can get accurate insight. Rubbish in, rubbish out. It's the thing we always say. So our data warehousing and our business intelligence rely on our data quality management. Our master data management relies on it because we need good quality data to support master data efforts. Okay. But at least we have a start. At least we have some data capabilities that help describe how we manage data. So we can use those to help develop our strategy for managing data. The question is, is our strategy working? Well, let's think about what a strategy is. Let's think about a military strategy to start with. So a military strategy is the planning, coordination, and general direction of military activities to meet overall political and military objectives. So it's very much a planning, a thinking exercise. It helps us coordinate, helps us line up our thoughts and decide what we're going to do and in what order. So what's a data strategy? Well, it's simple. It's very similar. It's the planning, coordination, and general direction of data activities to meet overall business objectives. It's the planning of our data activities that allows us to determine in what order we're going to start developing these capabilities. Where are we going to focus first? Are we going to focus on particular types of data first? Are we going to focus on particular capabilities so that we start to prove the way we're going to manage data and then roll out further? So our data strategy encompasses this. This is our roadmap to how we're going to get better at managing data. How do we develop a data strategy? Well, there's just a few steps we take. And this is based on my own experience. So firstly, we conduct a data management maturity assessment. We look across the organization. We talk to people. We observe. We understand how good we are at the moment at managing our data.
And from that, we determine that the overall maturity is extremely low, because everyone's is. Because this is hard stuff. This is stuff that people are still trying to get to grips with. Our industry is still maturing. So to be honest, most organizations, when you measure their maturity, generally figure towards the bottom end of the scale. So step three, we enjoy a painful meeting with our senior executives. We explain to them how poor our data management is. The full enormity of the challenge is laid bare and people start to sweat. So what do we do? Well, we agree with our execs a few short-term tactical fixes, because everyone agrees everything else is just too difficult. It's just too big a challenge to rise to. So at that point, we just go back to our desk. We cry into our coffee and we just carry on. Now, it may well be that people have a different experience, but I've been doing data strategy for quite a few years and I've learned the hard way. The development of data strategy often creates a very negative response in the organization if you approach it from the typical starting point, which is looking at where you are now. So I think we need to think differently again. So a strategy is a plan of activities to get us from the current state to a desired future state. But the problem is it's often clouded by negative perceptions. It's often clouded by people looking too much at where we are now rather than where we want to be. Everyone thinks about the issues. Everyone thinks about the challenges. It's a very negative start to a strategy. So at that point, it starts to become very tactical. People start thinking about the doability of it too soon. We need to worry about that, but we don't worry about that at the beginning. So really, our strategy needs to be driven by a high-level business vision. That is absolutely crucial. The vision has to come first. So what is a vision? Well, a vision is a picture of our desired state created by the business.
It's very much a business-driven thing. It's not about data capabilities to start with. It's about thinking of data as an asset and about what it's going to do for our business. How is the future going to be different if we really use information and data to drive it? So it's looking very much at our aspirations rather than our current issues. It's starting to think about the possibilities of using data differently, of using it in new and innovative ways, of being a game changer. This is the whole point of a vision. It starts to engage people. It starts to excite people. So it takes a long-term view. We're not talking 12 months. We're talking a few years. We're talking maybe seven or eight years, perhaps. We know that we're not going to get there straight away, and we recognize that we're going to have to break this down into interim milestones. But we at least take a longer-term view, because we know that to achieve a vision that's worth achieving, it's going to take concerted effort over a sustained period. And of course, this provides a basis for our strategy. We still need our strategy. But at least if we create our vision first, we have a real business understanding of what we're trying to achieve, and we can then translate that into something which we can then achieve, which is very much aligned with our business. So my advice here is don't start your strategy until you've developed your vision. So how do we develop this data vision? Well, let me give you a few steps. So firstly, we need to engage with our key stakeholders. We need to really understand what the possibilities of data are for our business. That's very much a business perspective. Get people thinking about what it is we're trying to achieve. Are we trying to connect with our customers more? Are we trying to actually get inside our customers' behavior?
What are the sorts of things that we might be able to do differently if we have the data available to us and we process it and analyze it in the right way? So when we talk to our stakeholders, we need to collect lots of information, lots of input, and we have to analyze that in terms of the drivers, the business drivers for information, the opportunities it's going to present, and also the challenges. We still need to think about the challenges. We're not going to dwell on them, but we'll need some appreciation of what these challenges might be. And we analyze these different things to identify some themes. Is it about getting closer to our customers? Is it about creating an information culture where everyone is contributing to our success through the good use of information? Is it about protecting our reputation through data by doing the right things, by having an ethical approach not only to our business but also to the data we capture about our customers? Is it about making better decisions faster? Is it about working smarter, not harder, by cutting out all the waste, cutting out all the manual exercises that we all do with data, and streamlining our approach? And when we think about this, we turn any negatives into positives. So when we think about the challenges, we turn the challenges into positives by crafting a description of the brighter future in which that challenge has disappeared, in which that challenge has been resolved. So we can do this for each of our themes. So from our drivers, opportunities, and challenges, we've come up with a number of themes. Maybe it's 10, maybe it's 12. And now we can start to describe what the future looks like in each of those themes. Can we paint a picture of the future that really means something? And then we can present that to our senior execs. And we don't present it to them as a list of bullet points. We present it as an exciting, engaging description. We use imagery and metaphors to bring that vision alive.
And we agree this isn't going to happen overnight. We agree that it requires sustained effort if we want to achieve that vision. We make it clear that the vision is achievable, but only if we carve it into some interim milestones. We're still thinking about the future at this point. We're still thinking about where we're going, not where we are. And as we start to develop that into a fully fledged strategy, we very much keep the vision at the forefront. The vision doesn't get lost. The vision gets updated as we learn more about our future aspirations. But we keep that at the front and we develop the strategy from it, rather than start off with a strategy and immediately start off on the back foot. So I've developed a few of these visions with our clients. And I mentioned using imagery. We recently did this with a water company, a water business. And so we used water as a metaphor, not just because they are traditionally a water business. That was what they did. They served customers in terms of providing water. Actually, the metaphor also was a good way of describing what data is, the fact that data changes state much like water. Water flows, much like data flows through a business. But it needs to be managed. It needs to be protected because it's a valuable resource. And this metaphor worked really well. And we combined different images to try and describe the different aspects of water to bring out each of our themes. And it helped us really engage at a business level, something that wouldn't have happened if we'd gone in and started talking about data and data governance and data architecture and data integration. These terms would have been alien to our business sponsor and our business executives. We had to create a vision of the future that they could really relate to and tie it back into something that really meant something to them. In this case, it was water.
So I've talked a lot about my different perspectives on data and on information. And I'd like to ask you whether you think it's time for a rethink, because I do. Because I've been working in this area for many years. And after a period of time, you gradually realize that with some of the things we do, we just don't think. We just accept the way we've worked previously. We accept what's gone before. And we stop challenging it. And we stop thinking. We stop rethinking, and we can't do that. So I'd just like to close by trying to summarize what it is I've tried to explain this evening. So information is simply our interpretation of the complex world around us. That's what information is. Digitizing information and using computers to store it as data has helped us achieve things we never thought were possible even 10 years ago. Unfortunately, in our rush to push the limits of achievement and embrace the possibilities, we've really confused our thinking. We've messed things up. We've created artificial barriers where there aren't any. Because ultimately, it doesn't really matter whether data is big or small, whether it's raw or refined, whether it's firm or fluid, whether that data has been created by a human or created by a machine. Any of that data is just data. So our technology might have advanced. We might have become far more sophisticated with our technology. But our goal remains the same. Our goal is a unified approach to governing, defining, managing, and exploiting our data. Now, are we there yet? Do we have the theory of everything for data? Just as Stephen Hawking was attempting to create a theory of everything for our universe, well, we're not there yet. But I don't think we're that far away. We're a lot closer than you think, if only we start to think differently. And probably it's only fair to leave the last word to Stephen Hawking: intelligence is the ability to adapt to change. And that's what we need: to change. We need to think differently. We need to adapt.
And thank you for listening. And over to you, Shannon. John, thank you so much for this great presentation. Stephen is definitely one of my heroes as well. Questions are coming in already. If you have a question for John, just submit it in the bottom right-hand corner in the Q&A section. One of the most popular questions we always get is people asking whether they'll be receiving a copy of the slides and a copy of the presentation. Just a reminder, I will send a follow-up email within two business days containing links to both, as well as anything else requested throughout the webinar. Now, just a statement here for you, John, that maybe you want to comment on. Data or information: data is the raw material. I feel strongly that what I do is data quality. I'm working on data quality for the IRS compliance data warehouse. Researchers use the data and turn it into information. They need to be concerned about information quality. The International Association for Information and Data Quality has rebranded itself as IQ International. Any comments on that statement? Yes, and so the idea that information is derived from data, that's a long-held belief. I kind of understand what people are saying there, but to say that we manage data quality, we do, but let's not say we wouldn't also manage information quality. If you'd still like to think of the two terms as the progression of data into information, you still need to manage the information itself and the quality of it. I like to think of things differently now. I like to think of all of it, whether you use the term data or information, as a continuum. The fact that there is this long-held belief that suddenly data becomes information and we do things differently, I think, is a very traditional way of thinking, and I don't think it's particularly helpful moving forward.
So my advice, the approach we're starting to take with our clients, is to approach data and information as a continuum and try and create a unified approach where we deal with both, because otherwise you end up falling into old habits, and we know that actually we do need to think slightly differently. Sarah, in the comments, is coming in with a response: thank you, I see your point. People are being awful quiet today, or just taking the time entering in some questions. Oh, it's quite a lot to take in in one go. But just to mention, Shannon, I'd be very happy if anyone does want to contact me after today's talk, either by email or Twitter, and if anyone, having slept on some of the things I've mentioned today, comes back with loads of extra questions, I'd be very happy to take them at any time. I love it. Thank you so much. I'll make sure to include your information as well in the follow-up email. We do have another question coming in. I consult for an organization that is beginning their quest for knowledge management within their ITFS department versus from within the corporate side of the business. What are your thoughts on this? So as I described in the talk, knowledge management has always been, in my experience, very much a poor relation, in that people have left it to the heroics of staff to continue doing all the good work that they've done, and so all the knowledge tends to be in people's heads. It seldom gets documented: the processes, the procedures, you know, the best practice. So when it comes to actually tackling knowledge management, my advice is to start to adopt some of the rigor you would if it were other types of information and data. So some of the approaches around governance, like ensuring you have ownership of your knowledge, that's equally as important as it is with your information, as is ensuring you've got standards so that when someone defines some best practice, it's defined consistently.
So really the knowledge that we're trying to capture and manage is just more information, and we have to adopt exactly the same approaches we do for other types of data. So why have two or three different approaches for data, information, structured information, unstructured information, knowledge? Why don't we have one unified approach? We're actually trying to apply the same principles to all of it, those principles being: we define what we need, we manage it well, we exploit it, and we govern it. And that's my advice on that. Right, so thanks. Boy, our attendees are really quiet today. I'm curious. They're usually so busy and so engaged, but everyone's quiet. John, what's some of the biggest pushback that you get on your thoughts here? The pushback. So I think it is really just getting people to take a step back. Everyone is so busy running around managing data, often using very inefficient approaches, because everyone's too busy to think differently. And one of the things I really enjoy when I give talks like this is giving people an opportunity to just step back from their day job to think, actually, is there a better way? As I said, you know, DAMA do fantastic work to bring together, you know, best practice and a body of knowledge. But even then, people are having to find time to do this, and often we need to all pull together to just try and really think about the industry we work in. Sometimes we run at 100 miles an hour, and sometimes we just need to slow down a little bit and just think, what is it we're doing? When people were starting to introduce the concept of big data, immediately everyone thought, oh, great, but we need a new way of tackling it. Well, I'm not so sure we do. We might need some clever technology. We might need a few extra things, but essentially it's just data.
Why should we give up the best practice that we developed over all these years just because we're describing data as big? So the pushback I get is often around "we're just too busy to think differently." So part of my job, part of my role, is to just provide that catalyst in organizations, to just let them think about it differently. Because ultimately, you save time, you save money by having one approach, and you apply that approach regardless of what the data is. You maybe don't need to have different approaches for different types of data. You have one overarching approach and you just apply it rigorously. Makes sense. Okay. All right, next question coming in. How do you engage the business side of the organization to take ownership of this and stop IT making everything an IT issue? Yes, indeed. The most important thing is to find a sponsor. The most important thing is to identify someone in the organization who not only has the right level of authority, someone senior enough, but is also someone who's feeling the pain of poor data management, someone who's willing to put their name to doing things differently. And you need to find that person in the business. That isn't an IT person; that's someone in the business who's a beneficiary of good data, where both they and the people who report to them rely on it. So if you can find that good sponsor and start to develop, as I've said, the business vision, you start to get the business stakeholders excited about using data, and not thinking of data as almost a side effect of IT, something that is a burden, something that has to be managed. It really has to be something that they want to manage, something they really want to get value from. So the most important thing is to very much engage with the business. The way I tend to do that is to create the stories. What I showed you there was some of the imagery we used with one of our clients.
We actually created a story for them around data. You know, we helped bring it alive, so it really meant something to them. It didn't talk about data in data terms; it talked about the business and how data was going to support the business. All right, I think we have time to sneak in one more question here. Can you suggest any resources for selling information management to senior leadership? Any resources, okay. I struggle to think of any resources that I'm aware of that really tackle that job. I've seen some very good presentations on the topic from some of my colleagues on the speaking circuit, but I must admit I've not seen any good resources that you'd be able to just pick up and use as a manual. I think the problem with our industry is we're still very much maturing our approach. Even though we've developed lots of different techniques for managing aspects of our data estate, I think some of the stuff around the edges, you know, selling data management and the whole people side, a lot of that stuff is still struggling to catch up. We've been very much a technology-led industry for a long time, and I think we're only now realizing that most of the problems with data are to do with people. Not only people and how they use information, but whether or not they value the information, whether or not they're going to support data initiatives. So most of data management, in my eyes, is a people problem. We're getting there, but unfortunately there are no resources I could just point people at that would help them wave that magic wand. Of course, do come and speak to me, and maybe a conversation might be more useful than a particular resource. John, thank you so much for this fantastic presentation. It's just been a pleasure. And thank you so much for attending in the early hours or the late hours of the evening. We appreciate it. And thanks to our attendees for being so engaged in everything we do.
We really appreciate the involvement that you bring and the discussion that you bring. And just again, one more time, I'll remind everyone that I'll send a follow-up email for this webinar by end of day Thursday with links to the slides, links to the recording, and John's information as well. Okay, thanks, Shannon. Cheers.