Hello, I'm Steve Nunn, President and CEO of The Open Group. Welcome to Toolkit Tuesday, where we highlight the various components and leading experts of the Architect's Toolkit, a collated portfolio of the most pertinent technology standards for enterprise architects. During the series, I'll be calling on a number of recognised experts who will bring their particular insights on how to most effectively use the various tools in the Architect's Toolkit. We'll have a mix of interviews, panel sessions and pre-recorded presentations along the way. While all standards of The Open Group are designed so they can be adopted independently of one another, the greatest value for an organisation can be derived when they're used in unison, for the sum of the parts should be greater than the whole. In the Architect's Toolkit, we have collated a portfolio of the most pertinent ones for architects, together, all in one place. For most of these tools, certification from The Open Group is also available, so practitioners can demonstrate that they have the skills required and recruiters can take the guesswork out of the recruitment process, all backed up by our Open Badges programme.
So, on my whiteboard, I've written BIAT. Ironically, many are lured into asking, "Why have you written 'bait' on your whiteboard?" Others are simply curious about the symbols. But it was actually me illustrating a rant. The first triangle is how, historically, I think we've focused on the tech layer and neglected the others, like business, information and application. The end triangle is the desired state, in my opinion: all about the business, then the information, which is the lifeblood of our systems, and down. Things like modern compute, cloud and serverless should reduce the need for us as enterprise architects to focus on that tech layer. But I fear all it has done is simply bloat out our attention on the apps layer, an obsession with the development journey.
I'm not sure the freed-up attention has refocused sufficiently up the stack for my liking. As I say, a rant, but most importantly, a whiteboard in use.
Welcome, everybody. Welcome to Toolkit Tuesday. Good morning, good afternoon, good evening, or good night in some cases. Wherever you are in the world, I hope you're keeping safe. And thank you for taking time out of your day or evening to join us. Before we go on with today's main talk, just a thank you there to Paul Homan of IBM, one of our resident experts here on Toolkit Tuesday. Whiteboard in use indeed. Great thoughts as usual from Paul. A nice little one-minute sharing of things that are relevant to architects or could be relevant to architects. Very thought-provoking as usual. Thanks, Paul. And as I say, welcome to everybody wherever you are. Do share with us, if you feel able to, where you are joining us from. We love to see the different places in the world where our participants are joining us from, and it's always great to see. Today we have a lot of registrations for our topic on data, and I'll come to that in just a moment. So I know there are a lot of different countries there and they're starting to come in. Thank you for doing that, folks.
So just a quick housekeeping item. We're going to have a double act today, so two speakers. And if you want to ask questions of the speakers, please do that in the Q&A channel. Many of you are finding where the chat channel is to tell us where you're from, which is fabulous. If you want to ask a question, so it doesn't get lost in the chat, please put it in the Q&A channel. If it's not there on your screen already, you will find it by clicking on the three dots in the bottom right-hand corner of your screen. That will give you the option to click on the Q&A channel, and please put your questions in there. That's where I'll be looking for them. So without further ado, we'll move right on.
As I say, our topic today is data. Data is the new gold, the new oil, however you want to put it. It's so important to all of us as individuals, but very, very important to organizations, and the sheer amount of data and the different types of data have increased so significantly in recent years. So that's great in a way, but how do you actually integrate it? How do you actually use it and make use of it? To tell us a little about some of the work that's going on inside The Open Group and some potential approaches for tackling this issue, we've got two speakers today.
I'm delighted to introduce Dr. Chris Harding, who was a longtime colleague of mine at The Open Group before he founded Lacibus Limited. He's founder and principal of Lacibus, and he formed the company to provide services based on virtual data lakes and data-centered architecture. Now, a virtual data lake enables applications to connect to data sources and mix and match the data. Virtual data lakes provide fine-grained access control so that the owners of the data can control its use. They enable data-centered architecture of IT systems, in effect. And Chris developed the ideas that led to the formation of his company while working as director of the Open Platform 3.0 Forum of The Open Group. So welcome back to an Open Group event, Chris. Great to have you back with us.
And joining Chris today is another long-term, very long-term member of The Open Group and an absolute advocate for the importance of data. Ron Schuldt is manager of Data Harmonizing LLC, providing data integration training and consulting services. He is chairman of The Open Group Semantic Interoperability Work Group, which you'll hear a little about today. And he's responsible in that group for The Open Group's data element framework, or O-DEF, standard. Ron is retired from Lockheed Martin, where he had over 28 years of experience as a systems engineer working on systems design and integration.
So a warm welcome to Toolkit Tuesday, and from The Open Group, to Dr Chris Harding and Ron Schuldt. Over to you, gentlemen.
Okay, the first key question when we're talking about data integration is: what is it? To make sure we're all on the same page, it's machine-to-machine exchange of data. It is not machine to person, nor person to machine. A simple example: in a typical enterprise, you have a human resources department, and typically they control an HR type of application. One of the entries that an HR person might be responsible for is to authorize the pay increase for an employee. That pay increase would be input by a person in the HR department, which would be a person-to-machine interface, but then that HR system would need to have a machine-to-machine interface with the financial department and their accounting system. So machine to machine is what we're talking about for where data integration is required.
And with that, I'll look at what typically drives the need for data integration. Perhaps it's the example I just illustrated, but normally that's already taken care of by some ETL tool, an extract, transform and load tool. Other things might drive it for the enterprise. For example, having two or more systems that you want to compose information with, and presumably you do it once and you keep that transformation available all the time. But sometimes you bring in some new data from a system that wasn't previously there. Also business growth and the need to add new applications. Obviously, a small business might be able to get by with simply operating on spreadsheets, but eventually, if they grow to any size, they will need additional applications supported by different departments. Obviously, if you merge with another enterprise, there's lots of data, basically, you know, other than perhaps some unique product kind of information.
Most of the information needs to be merged with the new enterprise, which is a much larger one, presumably. Also consolidation of multiple applications into a single application. Many companies have moved to ERP-type applications, and it's a rather complicated and expensive process to merge all your data into a single application. Obviously, if you want to reduce your operating costs, that can be another reason for driving the need for data integration. Sharing data within an industry or a government: there are many examples of that. The mining industry is looking to move into that realm, where they have no standards and they're looking to be able to share data across an industry. Also governments wanting to be able to share data across all government entities. And then the need to share business or public data across federated entities that are perhaps all part of the same company or corporation, but exist in different countries. There, obviously, you run into a challenge with the perhaps different natural languages. And then last but not least, if you want to differentiate yourself. There's competitive pressure, then, to reduce your costs and be competitive across the board. And I'll turn it over then to Chris.
Oh, thank you, Ron. That gives a good summary of the reasons for doing data integration. As Steve said, I'm the principal of Lacibus, and we produce a virtual data lake, which is effectively a data integration platform. As such, I participate in the Data Integration Work Group of The Open Group. It's part of the Architecture Forum, and its aim is to create a body of architecture artifacts to help architects with data integration, and also a framework to stitch them together. We've published a white paper on technical standards for data integration and are looking to define our forward work program.
And in order to do that, we undertook a survey of enterprise architects to find out what the practical problems are that architects face in this area. We did an initial run with the Architecture Forum members, and then surveyed the AEA members. In all we got over 600 responses, which is a very good basis for understanding the problem and planning going forward. The picture it gave us of data in the enterprise was of business leaders who mostly view data as a strategic corporate asset, but the data use is often localized by business unit. There's a mixture of data in the cloud and on premise. Overall data quality tends to be mixed. Obviously enterprises differ on this: some excellent, some terrible, most somewhere in between. And there are often islands of quality data with different management regimes.
As far as data integration goes, we found that analytics and decision support are important reasons for integrating data, but operations and transactional are the most important. Those are the kind of reasons that Ron was talking about. The data to be integrated most often comes from databases, but a surprising amount is in electronic documents. There's also some from Internet of Things sensors and so on, and from social media. Mostly the data to be integrated is from within the enterprise, but there's often a mix of external data with that.
So if we could move to the next slide. Yes, the pain points that came out of the survey were, first, lack of commitment from business units: the business units don't understand why they should assist with data integration that's not contributing to their particular business unit's bottom line. There's often a lack of commitment at corporate level to, for example, finding the funding for enterprise data integration platforms. There's a whole mix of heterogeneous sources and tool stacks, and conflicting data models, and Ron will be saying a bit more about that later.
And there's often not a real culture of data management within the enterprise. Obviously enterprises differ on this, but lots of enterprises don't really have good data management. You get inconsistent data from different sources, duplicate records and poorly managed data.
The question is, how can we make data integration easier? In the context of The Open Group, this is by applying architecture methods and techniques. So what you see here, the existing work done in The Open Group, is effectively the current Open Group data integration toolkit. Perhaps the most important part of this is the standards in the Digital Portfolio. This is a concept that The Open Group is now organizing its important digital standards under, and there's going to be a Toolkit Tuesday on that, I think, in a few weeks' time. So come to that if you want to find out more about the Digital Portfolio.
There's a lot of other work as well. The Open Subsurface Data Universe has defined an open source, standards-based, technology-agnostic data platform for the energy industry. Then there's the Open Footprint Forum. If you think about it, a huge cross-enterprise data integration problem is involved in finding carbon footprints. The Footprint Forum has defined a common model for footprint-related data covering all types of emissions. It hasn't actually developed a platform, but has pretty much specified what you need as a platform.
If you go into business domain standards, this is important for understanding the data that you need to integrate in your business. The Exploration and Mining Business Reference Model and capabilities map dates now from some years ago. More recently, The Open Group has produced a Government Reference Model to define a taxonomy for government operations. The Healthcare Forum in The Open Group has taken on stewardship of the United States Federal Health Information Model.
And the Commercial Aviation Work Group in the Architecture Forum has developed a Commercial Aviation Reference Model, which has more than data in it, but does include detailed reference models for data. There's also the O-DEF standard, which Ron will say more about. And last but not least, there's the data integration technical standards white paper, which I mentioned earlier, that the Data Integration Work Group has put together.
So if we can then move to the next slide. Looking at the key standards that The Open Group produces that relate to architecture development methods and that are in the Digital Portfolio, the most important one has to be TOGAF. If you look at the pain points that came out of the survey, you see where the business architecture phase comes in. If you do a good business architecture, this will give you the ability to obtain commitment at the corporate level and from the business units, and will help the enterprise to understand what it's costing them not to have proper data management. You organize and reconcile those conflicting data models in the information systems architecture phase, and the technology architecture will be where you think about dealing with those heterogeneous sources and tool stacks. So TOGAF is a framework within which you can address those problems.
If we move to the next slide. The Open Agile Architecture standard is a new standard, relatively new in Open Group terms, and it introduces three perspectives for thinking about the problem. Perhaps the most important thing it does is give you the ability to develop architecture in an agile way. The experience perspective looks at the presentation of the integrated data. It makes a big difference whether you just give people a table or give them a heat map, perhaps with an interactive query facility.
The work system perspective is where you look at how the data is put together, and there's a lot of interest now in DataOps as a continuous data integration technology. And finally, when you look at the technical perspective, that's where you might look at the data platforms you need to support your data integration.
The DPBoK is a humongous collection of knowledge on what digital practitioners need. It does contain a chapter with a section on data integration and the system of record, which we mustn't forget, because master data management is a de facto mainstream technology that is used for keeping data in the enterprise in a reasonably integrated state. And there's a TOGAF guide now that describes customer master data management, which backs that up and gives you more detail.
So there's a lot going on. I won't really say a lot about these trends because we're running short of time, but you can see there are some really exciting things happening in the world of data integration. If you look into your crystal ball, this is what it will come up with. So, looking at possible future work, I will now hand back to Ron, who will say what's going on in the Semantic Interoperability Work Group.
Right. Within the Semantic Interoperability Work Group, we are in the process of putting together a data integration white paper. It will focus on the semantics. So what's an essential first step in data integration? It's dealing with the semantics. You know, the tools by themselves generally can't handle differences of semantics. Obviously, AI tools can begin to do that. But the semantic mismatches are things like the same word having a different meaning between two different systems, or something that is described with one word in one system and a different word in a different system, when those two actually have the same meaning.
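[Editorial illustration, not part of the talk.] The two kinds of mismatch just described can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the field names, the two systems and the concept identifiers such as `person.name` are all hypothetical and are not real O-DEF codes. The idea shown is simply that each system's field names are tagged with a shared concept identifier, and records are translated by going through that identifier rather than by matching field names directly.

```python
# Hypothetical vocabularies for two systems. The concept identifiers on the
# right are invented for illustration; they are NOT real O-DEF codes.
HR_FIELDS = {
    "emp_name": "person.name",
    "rate": "employee.hourly-pay-rate",   # "rate" means pay rate in HR
}
FINANCE_FIELDS = {
    "staff_name": "person.name",          # different word, same meaning
    "pay_rate": "employee.hourly-pay-rate",
    "rate": "currency.exchange-rate",     # same word, different meaning
}


def translate(record, source_fields, target_fields):
    """Rename a record's fields from the source vocabulary to the target one.

    Each field is looked up by its shared concept identifier, so a field is
    never matched on its name alone. Raises KeyError if a source field, or
    its concept, is unknown to either vocabulary.
    """
    concept_to_target = {concept: name for name, concept in target_fields.items()}
    translated = {}
    for name, value in record.items():
        concept = source_fields[name]
        translated[concept_to_target[concept]] = value
    return translated


hr_record = {"emp_name": "A. Example", "rate": 42.5}
finance_record = translate(hr_record, HR_FIELDS, FINANCE_FIELDS)
print(finance_record)
```

Matching on field names alone would have copied the HR `rate` into the finance `rate` field, silently turning a pay rate into an exchange rate. Going through the shared identifier sends it to `pay_rate` instead.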
There are lots of semantic mismatches, and it becomes even more complex when you're dealing with different natural languages. So the purpose of The Open Group O-DEF standard is to provide a starting point for eliminating those semantic mismatches. To describe it compared to a data model: in a data model you've got entities, relationships and attributes, or ERA. Within the O-DEF, the equivalents are these. The object classes are essentially the same as the entities. It's a fundamental basic set of them, a starting point, if you will. Then for relationships, we've got roles. We've got a starting point of roles. And then, rather than attributes, we've got properties, a starting point of properties that are based on an ISO standard. The current version of O-DEF is version three, which is available at that link in the publications at C223. In addition, you can find more information about the O-DEF from an Open Group site, opengroup.org slash ODEF. And I'll turn it back over to Chris. Oh, this is our white paper. If you want to get involved in the white paper, or within the interoperability work group, contact me with my email or contact Judy Cerenzia, an Open Group staff person.
Okay, there's your... Okay, so finally, and I should mention that I previously put up a lot of references to Open Group work. We will be posting in the chat, I think, the URLs that you need to follow those up and to find all these things. But the Data Integration Work Group having conducted its survey, the thing that comes out as to what would be useful for this group to do is a data integration guide on how to use The Open Group architecture standards for data integration, using the TOGAF standard, the O-AA standard, the DPBoK standard and the other work I referenced. And I think the couple of slides I went through give you a flavor of the kind of things it might cover.
We've started off by researching the key use cases for data integration. Participation will be open to Architecture Forum members in The Open Group. If you are interested in this, and I hope you are, please contact either myself or Dan Hutley, who is the forum director for this work in The Open Group. So, thank you. And we're ready to take questions, I think.
Chris, Ron, thank you both very much for that. It's a lot to get through in a short time. These Toolkit Tuesdays are limited in time, I realize, and there's so much depth there, so I'd urge anyone interested in the topic, which I hope you all are, to look at some of the references and the links that are in the chat. They will also be in the version of this presentation and slides that are made available after the event. So do go for more information and follow those links. Thank you both, gentlemen.
A question that came in fairly early on, in fact right on your first slide, Ron, was a comment that is obviously a question too, because I'll ask you to comment on it: "I'm amazed that this slide does not emphasize the risk reduction achieved by removing unnecessary human intervention in end-to-end business processes."
Absolutely. A substantial amount of energy, and this is, you know, man hours, person hours, however you want to describe it, goes into trying to figure out what is meant by what data. And if you have bad data, then you are not able to use the data the way it was intended. So, obviously, a key part of the benefit of data integration is risk reduction and having clean data. It's easier said than done. I don't know of any enterprise that has been able to do it effectively, but you've got to start with understanding the data, both the source and the destination. And it's a very labor-intensive effort.
Absolutely. Yeah. There's a reason we have the term human error as well, isn't there? We are all human. We make mistakes. Okay.
One of the reasons we do these Toolkit Tuesdays is that there are so many different things coming in and impacting architects nowadays, and different technology. Obviously, one of those is cloud. Lots of organizations are either moving to the cloud or have moved to the cloud. It changes things. So, how does the growing use of cloud affect data integration? And is that something that we might work on in The Open Group? Maybe you want to take it.
So, yeah, I will have a go at that. It's a huge subject. I think it certainly is something that we could work on in The Open Group. The first generation of cloud really was just moving things that you could do on your own computer onto the cloud. We're now into what you might call the second generation of cloud, which is cloud-native applications designed specifically for the cloud, using perhaps a microservices architecture. I think that data integration inevitably will be moving to adopt this kind of architecture. And how it does that will be a matter of, I think, some interest to enterprise architects. So yes, I think that's a great forward topic for The Open Group to think about.
Okay. Thank you for that, Chris. Ron, I'm going to come back to you on a question on semantics, dear to your heart, obviously. The question is, how do recent developments in AI and natural language processing affect semantics in data integration? Do they make it easier or worse?
It can make it easier, but even in AI, you still have to spend time feeding those systems with trial cases so that they learn what the meaning of those words really is. It helps if you can have a good starting point, such as what O-DEF can provide. Actually, I'm working with someone out of Finland who is also an AI expert, and he sees great value in the O-DEF. And I'll just leave it with that.
Yeah, yeah. Please, folks, do go and look at the O-DEF.
The link is in the chat. Gentlemen, we are out of time officially, so I do need to end it there, but thank you both for sharing your knowledge and many years of experience in the topic. And anyone interested, please do get involved, go to the links, and we'd love to see participation in the relevant groups within The Open Group. So for now, a warm thank you to Dr Chris Harding and Ron Schuldt. Thank you, guys.
Thank you. Thank you, everyone, for listening.
So don't leave us just yet. I do want to tell you what's coming next. But also, I mentioned earlier on that we encourage you to let us know where you're joining from. We always get a lot of different countries, and we did today too. I don't usually do this, but one in particular stood out for me. We have somebody joining us from the Ukraine. So, Pavlo Revinkov, joining us from the Ukraine, thank you for taking the time out. I've said many times on these events every few weeks, our thoughts and prayers are with your people at this time. And it's humbling to have you take time out to join us today. We all hope this situation is over very soon. So thank you for doing that.
Next up, in two weeks' time, June 14, our next Toolkit Tuesday should be of interest to you, I hope. It is a presentation from my colleague Mark Dickson, Architecture Forum director at The Open Group, on the TOGAF Standard, 10th Edition: what's new and what's different. Many of you will know, and if you don't, then I'm delighted to be the first to tell you, that we recently announced the update to our TOGAF standard. There's a lot that's new and different. It's structured differently and it builds on the past, but there's lots of really great new material that really is focused on how to use it. So please join us in two weeks' time, on June 14, to hear what's new and what's different in the TOGAF Standard, 10th Edition.
So that's it for today, folks. Thank you for joining. I really appreciate it. Look for the slides and presentation to be available; you will be notified, as you've registered for this event. Keep safe wherever you are in the world, and thank you for joining us. I'm Steve Nunn, and thank you for joining Toolkit Tuesday. Bye for now.