Okay, so in today's presentation I'm going to try to avoid acronyms, but the two terms that come up really often in this context are thematic research data commons and the People Research Data Commons, so I might end up using the acronyms thematic RDC and People RDC. But let me start by acknowledging the traditional owners, and acknowledging and celebrating the First Australians on whose traditional lands we meet. For me, that is the Wurundjeri people of the Kulin Nation, and I pay my respects to the elders past and present.
So before we go on, I'd like to set the context about the ARDC. Our purpose is to provide Australian researchers with competitive advantage through data, and we do that by accelerating research and innovation by driving excellence in the creation, analysis and retention of high-quality data assets. Some examples of where we are working with research communities to do that are these three exemplar programs.
The first is HeSANDA, the Health Studies Australian National Data Asset program. This program is focused on making FAIR data from NHMRC-funded clinical trials and cohort studies. This particular project includes over 91 collaborators from 72 organizations, which include 18 universities, 10 medical research institutes, 19 health service operators and 16 clinical trial networks. So it is a significant initiative that demonstrates the ARDC's expertise in the creation of data assets, and also its national leadership in bringing these stakeholders together to deliver this program.
The second program is the Translational Research Data Challenges program, where the ARDC has again moved away from the open call model to a strong consultation and facilitation process to identify the capabilities the sector needs in addressing societal challenges and the data challenges that underpin them.
In particular, this program demonstrates the ability to bring together agencies, particularly agencies on a national scale, and to partner with them on the pathway from research to impact.
The third program is the HASS Research Data Commons, which is in the humanities and social sciences and the Indigenous studies and Indigenous research space. This program again focuses on the needs of a particular community and on identifying what their challenges are. The particular context of the HASS RDC was that the department had identified the four projects and programs that were the focus of the initial development of digital capability. But the approach we're taking with the thematic RDCs is actually to talk to the communities to identify what the community requirements are and how we can meet those challenges. So the takeaway from these three programs, I think, is the ARDC's extensive expertise in working with communities nationally and bringing them together to address data challenges.
Now, coming back to the People Research Data Commons: I'm putting this slide up at the start to set the context for today's discussion, and I will come back to it at the end of the presentation to formulate the value proposition for the People RDC. So today we're here to discuss what we are calling the People Research Data Commons, which is anticipated to support health and biomedical research. The aim of the People RDC is to enable national-scale, cross-sector, cross-jurisdiction and multidisciplinary data collaborations to accelerate research and research translation. The primary objectives of today's round table are to examine what the People RDC needs to deliver and why this is important for research. In other words, why would the outcomes delivered by the People RDC be transformative for your research and for research nationally?
I will be presenting a straw person concept for the People RDC, and then we will interrogate the data challenges and look at the priorities so that we can shape the People RDC to meet the needs of the national health research community. For today's discussion, I'll need about 25 minutes or so to present the concept. Then, as I mentioned at the start, we will do a Menti poll to gather your feedback on some of the consultation questions that were shared with you before this meeting, and then we'll get into a Q&A and discussion session. So to reiterate, the questions I would like you to consider as you walk through this presentation with us are: have we got the right data challenges, and what are the priorities? Once we understand the what and the why, we will be able to look at the how, and that would be by working with our partners to consider how these challenges could be addressed.
I would like to highlight at this point that the delivery of Australia's health research data involves many organizations nationally, and there are significant activities happening nationally around health data. This includes other capabilities under the National Collaborative Research Infrastructure Strategy, that is, the NCRIS facilities such as the Population Health Research Network, and data acquisition facilities like Bioplatforms Australia, the National Imaging Facility and Microscopy Australia. Just to mention that today we are not exploring the how, but ensuring that we have a deep and good understanding of what the challenges are, what the priorities are and how they will transform our research.
So historically the ARDC has partnered with the research community to develop and deliver a portfolio of world-class research infrastructure, but digital research infrastructure is now so critical to all research that we cannot meet the demand.
And so the ARDC's future strategy is based around this concept of thematic research data commons, which enable us to support the maximum number of researchers through a small number of strategic priority areas. So, if you like, it is a fabric of national research infrastructure capabilities that are selected strategically rather than competitively and co-designed with the research community. This fabric has both nationally focused capabilities in the four ARDC portfolios that strengthen and support the broader system, the horizontals on the slide, and a deep focus on identified national challenges and opportunities, the verticals, to provide a balanced national system. And the ARDC, as a hub of expertise, is really well positioned to drive best practice in the creation, analysis and retention of high-quality data assets, and then to share this expertise across domains.
So we are establishing two pilot thematic RDCs in the 2022-23 financial year, and there is $15.8 million from the 2020 Research Infrastructure Investment Plan to support the pilots. The themes for the pilots are, first, People, which we anticipate will focus on health data, given the significant body of work, the support we've experienced and the track record we have working with health research. The second theme is Planet, which will cover environment and agriculture. Both People and Planet are working titles for now.
So let's look at how we've approached the thematic RDCs. We've taken a phased approach to developing the concept. The definition phase involved the translation of the strategy into a conceptual model for a thematic RDC that is built on the characteristics of the ARDC's initiatives. This conceptual model was then extended into a more detailed model at a thematic level. The decision phase involved the development of the selection criteria and the framework for identifying the pilot themes.
And finally, the development phase is where the implementation plans for the pilot thematic RDCs will be finalized. As for where we are on this journey right now: developing the detailed models for the themes required the selection of the themes to happen first, because the digital research infrastructure needs vary quite significantly across research domains. For example, the requirements for handling really sensitive health data are quite different from those for handling climate data and for modeling and simulation. So we now have a detailed model of the People RDC that we would like to walk through with you. From our initial discussions we think that we've got the broad set of challenges right, but, as Rosie said at the start, it is a big vision and it is a 10-year plan. So we're really keen to get your feedback on the challenges, to identify the priorities and to see where we need to focus our initial efforts.
Now, just touching briefly on the criteria for selecting the themes themselves. I won't go into this in a lot of detail, but this is the framework being used for identifying the themes. We looked for the alignment between national priorities, research concentration and industry activity to maximize impact. In addition to this, we considered the demand for digital research infrastructure and the areas of synergy with other research infrastructure providers. Based on this, we've got the two themes of People and Planet.
Now, conceptually, the key components of the delivery model for the thematic RDC are these four layers. I'll start with expertise. This represents the deep knowledge base within the ARDC team related to data, the research data lifecycle and digital research infrastructure. Secondly, the national or federated services and infrastructure represent those enduring and underpinning capabilities that the ARDC has committed to for the long term.
The third layer is projects. These are the activities that we undertake with national stakeholders to develop new digital research infrastructure. And the final component is program governance, to ensure that the thematic RDC effectively aligns with national priorities and meets the needs of research, industry and government. One thing I'd like to highlight here is that projects receive funding for a time-bound initiative, and sustainability has been a challenge. The thematic RDC model looks at addressing this: when the outcomes from a project are of national significance or national interest, we can look at transitioning them into services.
So to illustrate this model for the People RDC, this is an example of how we could map existing projects and activities to the thematic RDC model. And a bit of a lag I'm seeing here. So let me walk you through this representation. We've got the key elements of the thematic RDC in the research cloud represented here: the services that are developed and managed by the ARDC; national data assets from researchers, which are at the heart of the thematic RDC; the platforms and tools developed by new research partners and research teams; and finally the policies and standards that the research sector agrees upon. We also have the national partners that we're working with on this digital research infrastructure landscape.
So in the context of the People RDC, there are three questions we need to consider in shaping this research data commons. What data challenges does it need to address? Why will addressing these challenges be transformative? And finally, how can we deliver digital research infrastructure to address these challenges? So let us consider the question of what data challenges need to be addressed by the People RDC.
We analyzed the health research national priority areas outlined in these national strategies, and we then looked at the research funding priorities across the NHMRC and the MRFF. Finally, the third area we looked at was the requirements for the translation of research into healthcare improvements for impact. This analysis of the priority areas and the funding priorities highlighted some areas of focus. And then the learning health system framework developed by the academic health research translation centers identified key digital capabilities needed to underpin evidence-based health research and translation. So if you look at the learning health system framework, it outlines these data and information systems requirements that are directly relevant to the national digital research infrastructure capabilities. To summarize, these are: high-quality data from healthcare and other sources; compliance with the Five Safes and the FAIR data principles, along with legislative and privacy requirements; data governance, sharing, linkage and analysis; emerging areas around big data analytics and machine learning; and finally the underpinning technology and infrastructure.
So we've looked at the current data landscape to identify the data challenges and the digital research capabilities that can address them. I'll briefly summarize them here, and I will go into them in detail over the next few slides. They are: data discovery across distributed national data assets; national-scale data analysis and collaboration across diverse data sharing and analysis environments; applying advanced analytics over distributed data; and accelerating and scaling data linkage capabilities to meet research demand. So let's walk through each of these in a bit more detail.
The first challenge is around data discovery. We have a really rich and complex data ecosystem, and these national data assets are distributed across the government, research and health service sectors.
Here are just some examples of these national data assets. We've got the AIHW, which manages the data coming from Medicare and pharmaceutical data, and we've got MADIP data, that is, the Multi-Agency Data Integration Project dataset from the ABS, as significant Commonwealth data assets. At a state level we've got linked administrative and health data, for example admitted patient data or emergency data. From the research sector institutions we've got data coming from instruments, like imaging and genomics, through to clinical trials, cohort studies and clinical quality registries. And finally, from the health services we've got electronic medical records.
Now, the challenge for potential users of these data assets, including researchers and industry members, is knowing what data is available, where it is and how it can be accessed. The diversity in this data landscape, and the challenges that come with it, are largely driven by the sensitive nature of health data and by jurisdictional and regulatory requirements. So we need to take a federated approach to addressing these challenges. Federation, in the data context, means synthesizing autonomous data capabilities into an accessible, coherent service for data users. A federated approach to data discovery and data request is already being piloted through the ARDC's HeSANDA program. HeSANDA, as I mentioned at the start, is the Health Studies Australian National Data Asset program, which is creating these FAIR national data assets in health, starting with NHMRC-supported clinical trials and cohort studies. We now have an opportunity to build on the outcomes from this program for a federated service that makes the breadth of national data assets discoverable.
The next challenge is in data analysis and sharing. This challenge is again driven by the diversity of secure environments for analysis in the national ecosystem.
These secure environments no doubt protect data, but they also lead to data silos that constrain data collaboration and analysis. You can see some examples of the secure environments being used by organizations across the government and research sectors. For example, there is the AIHW's SRAE and the ABS's DataLab, the PHRN has SURE, the Centre for Victorian Data Linkage has a platform called VALT, and research institutions use platforms like ERICA and SeRP. These environments enable the researcher to analyze sensitive data provided by each organization within a secure service, but when researchers are combining data from multiple sources it becomes challenging to work across these diverse systems. So a federated approach to handling this challenge is enabling interoperability between these systems while maintaining end-to-end data governance. Interoperability that builds in privacy and data protection by design has the potential to open up significant opportunities for data collaborations across sectors, jurisdictions and disciplines, and data governance is key to ensuring that data custodians have full control over the data at all times.
Now, while the previous challenge focused on bringing data assets together for research, there are scenarios where there are significant difficulties with collating and analyzing data from multiple sources in a central location. This slide shows one example of such a scenario, where the data sources could be spread across the health services, a research institution and an industry partner. For example, say it's a partnership between research, industry and health services to develop a decision support tool for the early detection of dementia. The industry partner might be developing a model along with a research institution. In this scenario, where it is difficult to aggregate the data for the analytics, the other option is to take the analytics to the data.
So what could happen here is that the industry partner trains the machine learning model on synthetic data, the research partner trains the model on data in, perhaps, a clinical quality registry, and the health services then train the model on data in their electronic medical records. The focus of a federated approach in this challenge area could be the development of frameworks, best practices, tools and expertise for distributed analytics, underpinned by fit-for-purpose digital infrastructure. The ARDC again has examples of projects we are partnering on that are looking at models for federated machine learning and analytics.
The fourth challenge is related to the ability to do data linkage at scale. We have world-leading data linkage capabilities nationally, at the federal and state level, that are already funded through NCRIS. For research, the ability to link research data assets to these national linked datasets is a critical need, and scaling up the capacity to do more of this linkage will accelerate both research and research translation. Data linkage also enables multidisciplinary research. For example, when research data is linked not just with federal or state linked datasets but also with cross-disciplinary research datasets available within an institution or between partner institutions, it allows researchers to interrogate data in new ways. So this data challenge could perhaps be addressed by building on existing national data linkage infrastructure to propagate and develop frameworks, best practices, tools and expertise to scale up data linkage capabilities nationally, while ensuring that the highest standards of privacy, security and quality are met.
So, based on these challenges, here's an overview of what the People RDC could deliver.
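As an illustration only of the "take the analytics to the data" scenario described above, here is a minimal sketch in Python of federated averaging, one common way to train a shared model without pooling records. It is not presented as the method used in any ARDC project. The function names, the toy one-variable linear model and the three made-up datasets (standing in for synthetic data, a registry and medical records) are all assumptions for this sketch.

```python
from typing import List, Tuple

# Each site holds (x, y) records that never leave the site (toy data, assumed).
Site = List[Tuple[float, float]]


def local_update(w: float, b: float, data: Site,
                 lr: float = 0.05, epochs: int = 5) -> Tuple[float, float, int]:
    """Run a few gradient-descent steps for y = w*x + b on one site's own data.

    Only the updated parameters and the record count are returned,
    never the records themselves.
    """
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, n


def federated_average(updates: List[Tuple[float, float, int]]) -> Tuple[float, float]:
    """Combine site parameters, weighting each site by its number of records."""
    total = sum(n for _, _, n in updates)
    w = sum(w_i * n for w_i, _, n in updates) / total
    b = sum(b_i * n for _, b_i, n in updates) / total
    return w, b


def train(sites: List[Site], rounds: int = 200) -> Tuple[float, float]:
    """Coordinator loop: send the current model out, average what comes back."""
    w, b = 0.0, 0.0
    for _ in range(rounds):
        updates = [local_update(w, b, data) for data in sites]
        w, b = federated_average(updates)
    return w, b


# Hypothetical sites; every record follows y = 2x + 1 so the result is easy to check.
industry_synthetic: Site = [(0.0, 1.0), (1.0, 3.0)]
registry: Site = [(2.0, 5.0), (3.0, 7.0), (1.5, 4.0)]
medical_records: Site = [(0.5, 2.0), (2.5, 6.0)]

w, b = train([industry_synthetic, registry, medical_records])
print(f"shared model: y = {w:.3f} * x + {b:.3f}")
```

In this sketch only model parameters and record counts cross organizational boundaries. A real deployment would also need the governance, standards and security controls discussed in the talk, which a toy example like this does not address.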
So we are currently bringing national data assets into the commons. What you see listed on the lower half of the slide are the health data assets that are either being developed or made nationally accessible through the ARDC's current initiatives. However, we completely recognize that this is only a fraction of the national health research data assets, so we would like your input into identifying the data assets that need to be brought into the commons. We also anticipate that these federated services could enable stakeholders across research, government, health services and industry to access and use the digital research infrastructure delivered through the People RDC, but also to contribute to it. And I would like to point out here that when we're talking about these federated capabilities, they do involve technical interoperability, but that is a relatively small part, while a major proportion will be standards, governance and processes.
So in summary, coming back to the value proposition of the People RDC: the People RDC will support health and biomedical researchers to develop cross-sector, multidisciplinary data collaborations on a national scale through federated models that deliver interoperable compute, storage and infrastructure services, along with analysis platforms and tools, and that, importantly, are supported by expertise, standards and best practices. The federated capabilities, again, are capabilities that deliver consistent practices, technical interoperability and common standards across what we know is a diverse health data ecosystem. We expect this to be a defining feature of the People RDC, and the key outcomes of the People RDC are these proposed federated services. So again, reiterating that the federation is going to consist of a smaller part of technical interoperability and a significant focus on standards, governance and processes.
So this essentially covers the what, that is, what we see the People RDC could address. And if you were to consider the why, as in why will addressing these challenges be transformative, here are some of the societal benefits of learning health systems that we see from international implementations. The primary one is the faster application of research advances in clinical practice. We know that translation from research into healthcare improvement can take anywhere between seven and 10 years, and an evidence-based learning healthcare system accelerates that process. Additional societal benefits could be healthcare that is holistic and evidence-based, increased patient access to clinical trials, and more equitable access to healthcare. In addition to these societal benefits, the proposed capabilities we've talked about could also strengthen our international partnerships, both by enabling international data to be made accessible to Australian researchers and by facilitating international data collaborations from a strong foundation. So this is the background to our current thinking around the People RDC.