OK, welcome back. Last session of the day. In the previous session we were talking about the operator's perspective and about value creation for the operators. A big part of the community in generating that value is the cloud service providers and vendors, so this section will have four presentations by CSPs and ISVs. We'll start off with extending vendor data platforms for interoperability, presented by Beth Reedy from Halliburton. What's nice about this is that it's not simply us delivering a commercial solution on OSDU that you consume from Landmark, for example, but actually how we're leveraging the platform in service delivery across some of the other Halliburton PSLs. So Beth, I'll invite you onto the stage, please.

Hi, good afternoon or evening. My name is Beth Reedy, principal product champion at Halliburton, where I pretty much own the data warehouse and have just recently taken over ownership of the data lake. I've got over 20 years of experience in data warehousing and analytics, but only three of those years are in oil and gas, and about half of that is in upstream production data. So I'm very much still learning all the data that's coming in from our individual PSLs as an oil and gas service provider. And I'm even newer to OSDU, so I'm really interested in seeing how we can further leverage OSDU and OSDU concepts as we move forward.

Today we're here to talk about some of our data challenges as a vendor, starting with data collection and discovery, then extending that into interoperability through OSDU APIs, and then discussing one of the mechanisms we're just now starting to build out, which is to reorganize our data based on data domain.

Well engineering is a discipline that in part builds upon the learnings of what was done before, of how other similar wells were previously constructed, so its success depends on the amount of accessible and collectible offset well data that can be made available. However, information from drilled wells is often scattered across systems and geographies. Data can be trapped in proprietary data formats, and that makes it really difficult for you to then integrate that data. These challenges increase the effort required to find, assemble, and prepare information to design the next well. One of the most important first steps is to actually centralize that data and the data access.

One of the big challenges we face with our internal data collection application, INSITE, is that we are generating over 14,000 individual well database files a year. Those files are then parsed into approximately 4.4 million time- and depth-based records. And that is just the beginning: then there are countless hours spent on data engineering and wrangling of that data.

As I mentioned, INSITE produces individual well database files and has been used for decades across multiple of our PSLs, or product service lines. The data is collected and archived in a single-well, single-service-line proprietary ADI file. Each instance of INSITE is installed on an individual laptop or workstation for each rig, and they can each be configured differently, which is great for those in the field who are only looking at a single well at any one time. However, this becomes really problematic when there's no global or company-wide consistency. Currently this data is also trapped inside those individual INSITE instances and has to be manually imported or exported, and even then you still only get a single view of a well. So if you are trying to look across many offset wells, this is not an efficient or very scalable effort.

Our first initiative to solve this was to ingest these ADI files into a centralized data lake. But the initial collection of the ADI files still does not make them usable for multi-well analysis, because those files, again, can only be read by INSITE and you still only get a single-well view. So our next step became to actually parse those ADI files out into columnar Parquet files.
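Since ADI is a proprietary single-well format that only INSITE can read, the talk doesn't show the parsing itself. As a rough illustration of the staging step being described, here is a minimal Python sketch that assumes some decoder exists to turn one well's ADI export into plain rows; `parse_adi_export`, the paths, and the column handling are hypothetical, not Halliburton's actual pipeline.

```python
# Hypothetical sketch: stage one well's time/depth records as Parquet.
# parse_adi_export() is a placeholder -- the real ADI format is proprietary
# and only readable by INSITE, so assume some decoder yields plain rows.
from pathlib import Path

import pandas as pd


def parse_adi_export(adi_path: Path) -> pd.DataFrame:
    """Placeholder decoder: return time/depth rows for a single well."""
    raise NotImplementedError("ADI is a proprietary single-well format")


def stage_well_as_parquet(adi_path: Path, staging_root: Path) -> Path:
    """Convert one single-well export into a columnar, multi-tool-readable file."""
    records = parse_adi_export(adi_path)       # one well, one service line
    records["source_file"] = adi_path.name     # keep lineage for later auditing
    out_path = staging_root / f"{adi_path.stem}.parquet"
    records.to_parquet(out_path, index=False)  # requires pyarrow or fastparquet
    return out_path
```

The point of the columnar staging is simply that, unlike the source files, the output can be opened by any downstream query or processing tool, not just INSITE.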
The initial purpose of our data lake was to be a centralized collection point for historical well data, while our RTS platform handles the real-time data. The next phase transforms those ADI files into the Parquet format I just mentioned. From this point, some of that data is further processed using ETL via Databricks and Synapse pipelines to take the real-time data and the historical well data, aggregate them, and combine them into a data warehouse. This is for more internal analytics and business intelligence uses.

Anticipating that our internal PSLs' data would need to be accessible via OSDU for operators and customers, we started an initial phase of working towards OSDU compliance. Our first venture into this used the physical Parquet files we had just parsed out of the ADI files and stored in the data lake, which is a component of HDU, the Halliburton Data Universe. When the HDU data lake received files, a JSON message was generated and sent to the HDU OSDU service, which then triggered the processing into OSDU through various APIs. Once that data was extracted, we still had to do further cleansing, mapping, and enriching of the data to actually meet OSDU requirements. Depending on the file type and the well, it was mapped to different custom OSDU schemas. After that, using the manifest ingestion workflow APIs and other APIs, the transformed data was provided to OSDU, and from there the OSDU data ingestion pipelines loaded it into the appropriate data model and storage. Once in OSDU, we validated and monitored the data workflows to ensure that the data was loaded accurately, completely, and kept up to date, using the OSDU Search APIs and Elasticsearch queries. So that was our first venture into utilizing OSDU with our data lake.
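To make those two OSDU interactions concrete, here is a hedged sketch of what triggering a manifest ingestion workflow run and then validating the load through the Search service can look like against the community OSDU services. The base URL, data partition, token handling, app key, and the `Osdu_ingest` workflow name are assumptions about a generic OSDU deployment, not details of the actual HDU OSDU service.

```python
# Hedged sketch of the two OSDU calls described above, using the community
# Workflow and Search services. Base URL, partition id, and token are assumed.
import requests

OSDU_BASE = "https://osdu.example.com"        # hypothetical deployment
HEADERS = {
    "Authorization": "Bearer <access-token>",  # obtained elsewhere
    "data-partition-id": "opendes",            # hypothetical data partition
    "Content-Type": "application/json",
}


def trigger_manifest_ingestion(manifest: dict) -> str:
    """Kick off a manifest ingestion workflow run for an already-mapped manifest."""
    resp = requests.post(
        f"{OSDU_BASE}/api/workflow/v1/workflow/Osdu_ingest/workflowRun",
        headers=HEADERS,
        json={
            "executionContext": {
                "Payload": {"AppKey": "hdu-loader",          # hypothetical app key
                            "data-partition-id": "opendes"},
                "manifest": manifest,
            }
        },
    )
    resp.raise_for_status()
    return resp.json()["runId"]   # can be polled later to monitor the workflow


def count_loaded_records(kind: str) -> int:
    """Validate a load by asking the Search service how many records of a kind exist."""
    resp = requests.post(
        f"{OSDU_BASE}/api/search/v2/query",
        headers=HEADERS,
        json={"kind": kind, "query": "*", "limit": 1},
    )
    resp.raise_for_status()
    return resp.json()["totalCount"]
```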
However, we were still finding that, even though we had parsed our original ADI files out into all these Parquet files, we were still having a lot of issues with the organization of the data, the fact that we had varying schemas within the data, and unit-of-measure issues. So we took a step back and looked at our data lake architecture.

A pretty typical industry-standard data lake architecture usually has about three zones, or ponds, or whatever you want to call them. They can go by different names, depending on who set them up or whose paper or book somebody read, but they generally have the same functional purpose. The first zone, the first area, is for ingestion; frequently you will hear it called bronze, ingest, or raw. It's all about getting data into the data lake, and it can be in any source format. In our case we've got the ADI files, but it could be JSON files, CSV files, or any other kind of file format coming in. Then we go into the next area of the data lake, which is really about transition, because we have to go from operational data to analytic data. That transition phase usually has some level of data transformation applied: data filtering, or whatever else is needed to start preparing your data for the next layer. Frequently this area is called silver or processed; again, numerous names. But the all-important part is to get that data to the consumption layer, which is frequently called gold or curated.

After we reviewed how our Parquet files were set up and laid out, we determined that while, yes, we had made the data more accessible and more readable, we were still very much organized in an operational format. So we really weren't ready to classify it as truly being that transition layer; it was still really more part of ingestion. That's why we've got that staged Parquet file area. These first Parquet files were stored in our data lake within specific PSL containers, and then further broken up by what we call a record, or description, which you could equate to a table or a data set, but one with varying schemas and non-consistent data. Within those sections, we were storing individual files for each well and run or section, and then by the timestamp at which the data came into the data lake and was parsed. So we do now have multiple wells stored together within the same area. However, if you wanted to do multi-well analysis, you still had to know every single individual well file that you wanted to go pull, and you would have to take all those individual files and bring them into Databricks or any other query or processing tool you were using.

Now we get to the part where those ADI files being individually configurable starts to have a really big impact. With those ADI files, when you're pulling from multiple wells, and they can have different schemas and different units of measure, you have to do a whole lot of data engineering and repeated data processing to get them into a usable state. So this is where we came up with the concept of moving our data and reorganizing it into domains. The idea is that instead of being organized by PSL, anything related to something like well construction or drilling or our wellbore data would all go within the same domain. Now you don't have to know your individual files when you're looking for data; you just need to know what area of the data lake to look in.

That still only addressed part of our problem: we still had the problem of the varying schemas. In order to deal with that, we are further classifying the data into what we would consider a subdomain, and that is really based on the granularity and the load type of the data. So, for example, drilling time series curve data would all be stored together, data that's really at a run or section level would be stored together, and metadata would all be stored together.
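The domain reorganization is described only conceptually in the talk, so here is a small sketch of the payoff it is aiming for: reading several offset wells out of a hypothetical domain-and-subdomain Parquet layout with partition filters, instead of tracking down individual per-well files. The paths, partition keys, and column names are made up for illustration and are not the actual HDU layout.

```python
# Hedged sketch: multi-well reads against a hypothetical domain-organized
# Parquet layout, e.g. /datalake/domains/<domain>/<subdomain>/well_id=.../...
# Container names, partition keys, and columns are illustrative only.
import pyarrow.dataset as ds

domain_root = "/datalake/domains/well_construction/drilling_timeseries"

dataset = ds.dataset(domain_root, format="parquet", partitioning="hive")

# Offset-well analysis: pull several wells in one query, no per-file bookkeeping.
offset_wells = ["WELL-0001", "WELL-0042", "WELL-0137"]
table = dataset.to_table(
    filter=ds.field("well_id").isin(offset_wells),
    columns=["well_id", "run_id", "timestamp", "hole_depth", "rop"],
)
df = table.to_pandas()   # ready for Databricks, BI, or ML feature preparation
```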
Organizing the data this way would put us in a position to set up a never-changing schema. That schema would be a mix of auditing metadata, source metadata, unit-of-measure detail, and then select record key and filtering columns that would be, quote unquote, flat or repeated, while all the other remaining fields would be stored as name-value pairs (there's a small sketch of this idea at the end). This allows us to be completely flexible, as flexible as our source data coming in, and also completely flexible for our use cases and the curated data sets going out, because schema changes from the source would not affect our downstream processing.

In conclusion, we believe that connecting data will improve efficiencies, starting with actually getting that data together, and that will in turn lead to efficiencies in the field. Accessibility will increase use case success as service providers and operators are able to share and utilize data based on OSDU standards. We are moving down a path of data reorganization based on domain to better support our changing data landscape. Our journey of getting our myopic raw data to a usable state for offset well analysis, machine learning and AI projects, and other BI and analytics needs is still ongoing. We continue to improve how our data is made available internally and externally by leveraging OSDU and OSDU concepts. And that's it. Thank you.
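To round out the schema idea referenced above, here is a minimal sketch of a never-changing, name-value-pair style layout: a few fixed key, audit, and unit-of-measure columns, with everything else carried as name/value pairs, plus a curation-side pivot back into a wide table. The column names and the pandas pivot are illustrative assumptions, not the actual HDU schema.

```python
# Hedged sketch of the never-changing schema described above: a few fixed key,
# audit, and unit-of-measure columns plus name/value pairs for everything else.
# Column names are illustrative; the real HDU schema is not shown in the talk.
import pandas as pd

FIXED_COLUMNS = [
    "well_id", "run_id", "timestamp",   # record keys / filtering columns
    "source_file", "loaded_at",         # auditing and source metadata
    "uom",                              # unit-of-measure detail
    "name", "value",                    # everything else as name/value pairs
]

tall = pd.DataFrame(
    [
        ["WELL-0001", "RUN-01", "2024-05-01T00:00:00Z", "w1.adi", "2024-05-02", "m",   "hole_depth", "1523.4"],
        ["WELL-0001", "RUN-01", "2024-05-01T00:00:00Z", "w1.adi", "2024-05-02", "m/h", "rop",        "18.2"],
    ],
    columns=FIXED_COLUMNS,
)

# New source channels become new 'name' values, never new physical columns,
# so upstream schema drift does not break this layer.

# Curation step: pivot the pairs back out into a wide, use-case-specific table.
wide = (
    tall.pivot_table(index=["well_id", "run_id", "timestamp"],
                     columns="name", values="value", aggfunc="first")
        .reset_index()
)
print(wide)
```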