So I'd like to welcome everybody to the first workshop of the HeSANDA Consultation Series. My name is Kristen Kang. I'm a senior research data specialist at the ARDC and the program manager for HeSANDA. Today we'll be giving you an overview of the initiative and discussing the research purpose of the data asset. To begin, we acknowledge and celebrate the First Australians on whose traditional lands we meet, and we pay our respects to their Elders past and present. Today's workshop will run for 90 minutes, split into three sections. We'll begin with an overview of the initiative by Dr. Adrian Burton, the ARDC's Director of Data Policy and Services. We'll then have a presentation on the value of data sharing, specifically sharing clinical trials data, presented by Professors Lisa Askie and Julian Elliott. That will be followed by a Q&A session where you can put your questions to the presenters. After that we'll move into breakout sessions, where we'll discuss the research uses and needs for data sharing, and the potential value a national data asset could provide for you. So how do you provide feedback during the workshop? As I said, there will be a Q&A, and you can submit questions via the chat channel. In the breakout sessions we'll be asking you to provide your user stories and value propositions. You should all have received the workshop reference document prior to attending, which included information about preparing some responses for those breakout sessions. After the workshop, we'll email you all a link to an online survey system. Via that system you can provide your own written submission in your own words, and there's also a structured feedback questionnaire which will probe some focus questions. You'll have until the end of next week to complete that. As I said, in the breakout sessions we'll be focusing on two questions.
How would you use a national data asset? And what kind of value would a national data asset provide you? For those of you who haven't already provided your breakout room preferences, you can go to that URL, tiny.cc slash hesanda. I'll be posting it in chat shortly, and you can put your name down against the topic you'd like to discuss. That's all from me for now. We'll begin with the presentations, and I'll hand over to Adrian Burton to present an overview of the HeSANDA initiative. All right, can you all see that? That was a yes, I'm assuming. Oh yes, I can see it. I'm Adrian Burton. I work at the Australian Research Data Commons. It's a national infrastructure facility, and we deal with things like data collections, data access and analytics platforms, storage, cloud, skills, policy, everything in a very holistic view of the national requirements for data infrastructure. We're part of the NCRIS program, the National Collaborative Research Infrastructure Strategy. That's a strategy from the Department of Education to support nationally significant data assets and services that support leading edge research. The funding comes directly from the Commonwealth government, and it's meant to fund national-level initiatives that are beyond local or institutional capability. Today we're talking about a health studies data asset that's part of a big jigsaw puzzle of related and overlapping data in this initiative. We're focusing on that middle part, the health studies; the types shown here aren't meant to be exhaustive, just to give an idea of what the focus of this initiative is. In 2019 we did some public consultation with the CSIRO on research health studies data requirements, and this very complex set of data possibilities, inputs and outputs of research and outputs of administration, all came up. We were encouraged to look at this part in the middle.
On the left you can see material coming from the health system, and on the right there are other types of inputs to research projects. What came out of the consultation was that there are, at least, custodians of these big health system data sets, and there are parts of the national infrastructure, like the PHRN, that give access to these kinds of data sets for linkage purposes. Over in genomics data, for example, we have a national research infrastructure capability in Bioplatforms Australia. In imaging there are several national capabilities, the National Imaging Facility and Microscopy Australia. But in this middle part, for the actual outputs of research projects, of health studies projects, there was not really a national-level capability of custodians, standards and national infrastructure. Of course all this data is important, and actually the value is in connecting all this stuff up, but this initiative, the HeSANDA initiative, is looking at the requirements of that middle area. When we say health studies, just to give you an example, we're talking clinical trials, registries, cohorts and other kinds of health study, and we're talking about the outputs of those research projects, the data outputs. So our consultation came up with a number of conclusions. The first was that you actually can and should create a national data asset from the outputs of these research projects, because they can support leading edge research; the kinds of things suggested were meta-analysis, guidelines, linkage, and scaled-up research more generally. We were advised to start with a distributed model, because there's a lot of fragmentation across health systems and jurisdictions and there wasn't an existing national infrastructure capability in this area. And the NHMRC were very keen for us to work in tandem with their new program of clinical trials and cohort studies. So what came out of that was this initiative.
It's called the Health Studies Australian National Data Asset, a three-year program that we're launching. We call it HeSANDA for short. It's meant to be a significant first step in this area. We're not starting where we want to finish, with a unified national service; we're starting with a distributed national data asset. On the right you can see a very common model, at least in information systems (this one is from drone flight management), of having a set of services and capabilities that are federated or aggregated together. You can divide up the problem but still have some kind of central view of it. To do that you need a lot of community coordination and coherence, and this project is really investing in that kind of activity; today is the first example of it. The key thing we're doing here is flagging our strategic intent: this data is important, and it can and does support very important research programs, but we haven't really set it up as an infrastructure concern. So the strategic intent is to stop treating data as a sort of waste product of a research industry that creates journal publications, and to think about how to capture some value from this output of the research process. Of course we're doing this now, but it's rather a cottage industry, with emails to people and social networks allowing access to some bits and pieces, not really at any kind of organised level. The idea of the HeSANDA project is to get us all together, with a strategic intent, and to take the first steps towards a more industrial model. The initiative is resourced by the ARDC with some of those Commonwealth funds, which are meant to catalyse these kinds of national data assets. And we have a program advisory committee. I just did this slide and I'm hoping I didn't forget anyone.
I did it a minute before we came here. We have, I'll say, the National Health and Medical Research Council, the Australian Clinical Trials Alliance, the Australian Health Research Alliance, the Australian New Zealand Clinical Trials Registry, Research Australia, the Population Health Research Network, and Julian's right there. And I forgot to write Cochrane Australia as well; I knew I was forgetting somebody. It's meant to be a community initiative, and we are taking guidance from the big national bodies in this area. We've made a few pragmatic decisions around focus. The full initiative is what we've said: it's around health studies broadly. But in the first instance we're lining up to get at least coverage of the studies that are funded by the NHMRC. We're focusing on the research organisations, the universities and medical research institutes. For the first focus area we're looking at clinical trials, and we may well add focus areas to do with subject area or disease. There are three strands to the HeSANDA initiative: we're dealing with data and getting coherence around data; we're building a coordinated set of infrastructure across the nation; and we're looking at the culture, what the researchers need, what the patients need, what a coherent policy environment for this looks like. Those three strands are set out in the project plan like this. Today you are at a data development consultation workshop, so we are right in the middle of that first phase. This data development phase is meant to get an initial scope and consensus around what the data asset will be. It's a step-wise process. Today's workshop is around the purpose of the national data asset. There'll be a second workshop focusing on the content of the data asset, and we'll look at shared practices and standards in a third workshop.
And in our fourth, we'll look at some of the threshold issues around governance and access. They're not really data problems, and not really a data development concern, but there are important concerns around access arrangements, consent, IT, et cetera. This process is informed by the AIHW. We've been lucky to have a partnership with the AIHW; they go through this process of creating national data assets all the time, and our process is informed by a formal process that the AIHW have. We have Roxanne Foster and Vicky Benner working with us on that. So that's all I wanted to say, except that the output of the four workshops will be an initial consensus document on the data asset and its scope. We really encourage you to contribute, to answer the structured survey forms that will come out around each of these topics, and to participate in the workshops. It's very important. We will take these questions to much wider stakeholder consultations once we've got the shape of what we're doing. We'll take it to patient groups and researchers and institutions to say, how should we do this? But it's this initial consensus that we'll be taking to that wider consultation. It will also be the input into our infrastructure development program, which will start at the end of the year. We will be providing co-investment into the development of infrastructure, and the requirements of that infrastructure will be the outputs of this data development phase that you're working on. On standards and common practices: this first phase is just a two or three month phase, and because data standards is a long process, we thoroughly intend to come back to it during 2021 in a second pass and grow the standards from there. So I'll stop there and hand on to you.
So the objective of today was to give an overview of the initiative and to start to answer that first data development question: what is the purpose of this kind of national data asset? Our next two speakers will talk on that second topic. Yeah, so thanks to Adrian and Kristen for inviting me to talk today. I'm going to set a bit of a broader scene, and then Lisa will go into some more detail around individual participant data meta-analysis. First of all, just to disclose my interests: I have a number of roles at Cochrane, including leading evidence systems and the Living Evidence Network. I'm based at Monash University at Cochrane Australia, where I lead the National COVID-19 Clinical Evidence Taskforce, which is developing national guidelines for COVID-19, and a longer term initiative, the Australian Living Evidence Consortium. I'm also co-founder and CEO of Covidence, a non-profit platform for systematic review, and I'm a clinician working in infectious diseases at the Alfred. I think all of us, I imagine, are aware of the many challenges to sharing health research data, but to start I just wanted to emphasise what I see as some of the main value of sharing individual participant data. First of all, it enables new discovery. Lisa will go into this in a bit more detail, but it's certainly true that by enabling others to have access to data, new research questions can be pursued beyond what the original trial investigators were able to do, and certainly far beyond what might be possible with only summary data. Also, in the spirit of reproducibility and transparency, it enables further validation of findings, making it possible for others to reanalyse and review prior results. Those mechanisms can certainly help to prevent unnecessary repetition of trials, and the waste and the risk to additional patients, because of the additional findings that can be generated through participant-level analyses.
And linked to that is the ability to improve the design of future trials, because of better insights from the trial data that has been accrued so far. Perhaps most importantly, it really maximises the contribution that a research data set can make, and therefore honours and maximises the contribution that participants have made through their participation in that particular health study. It's also important to note that clinical trial data sharing hasn't just appeared out of nowhere. It really is the next step in quite a long history of increasing transparency around health research over time. It started with the whole concept of randomised clinical trials; then the development of clinical trial registration, which Lisa will mention, and which has of course been critical in developing a stronger culture and practice of transparency; then the expectation of clinical trial summary results reporting in a systematic and consistent way; and now, building upon that, the rise of individual participant data sharing. Just to position this in the broader context, this is a schema of how we see the development and use of knowledge within health systems. At 9 o'clock you can see all of the activities around producing evidence or generating data. Where we as Cochrane sit is in the step around synthesising all of that primary research data. At the moment that's largely summary data in the form of journal publications, largely but not exclusively, because of the rise of individual participant data analyses. We then spend a lot of time producing rigorous and usable reports that can be used to develop guidelines and other products and services that disseminate that evidence to clinicians, and in addition, mechanisms to disseminate the evidence to patients, such as decision aids and shared decision making.
Subsequent to that, of course, there are a lot of people working on the implementation side: knowledge translation, evaluation and improving practice. What's very true currently is that this ecosystem is largely driven by documents rather than data, and is characterised by delays of years between each step, which I would argue is just no longer tenable. It's a complete travesty that we are still in a world characterised by such delays and inefficiencies. So what we see developing over the next few years is what we call a living evidence ecosystem, where each of these steps is accelerated, enabled by the development of these core principles and pillars. As you can see, at the core of this we really see digitally structured data. In the first instance that can be better structuring of summary data, but increasingly we see that it will be individual participant data. Around that there are many other aspects required in order to leverage it, including a culture of sharing, tools and platforms, common understanding of methods, et cetera. Also to note that we are in an era in which learning health systems are becoming more prominent and a reality. This is a concept originally developed by Chuck Friedman and others, and I think it's now really got a lot of momentum. It's about the way health systems can better use the data they themselves are generating, within internal cycles inside a particular health system: data to knowledge, knowledge to performance, performance to data. In this paper Chuck, together with some Cochrane colleagues of ours, really clarified that there is a necessary step in which a given health system also incorporates critically appraised external evidence.
Again, at the moment that's largely summary data in the form of conventional systematic reviews, but increasingly it will be synthesised evidence based on individual participant data. Certainly when you start to explore the real opportunities in, let's say, precision medicine, a lot of that really is driven by access to individual data that can enable understanding for, perhaps not individuals initially, but certainly smaller and smaller subgroups. So I would argue that increasing use of IPD is an essential step in the efficient conjunction of synthesised research evidence and individual health systems using their own data in a learning health system model. Later this afternoon we're going to be asked to think about user stories around the use of these data, so I'll just give you a summary of how I see the potential national health study data asset being of value to systematic reviewers. First of all, as a systematic reviewer I want unique identifiers and accurate structured metadata for each data set, so that I can find the data I need quickly and easily. This is absolutely critical; I'll talk about it in a minute. I want consistent data across studies, so that it's reasonable and appropriate to combine those data from different studies. I want the ability to understand the data, so that I, or a collaborative group, can make valid and efficient decisions about the way we should be working with these data and how we should or should not be combining them. That may be in the form of a data dictionary, but also, importantly, through the ability to have conversations with the original trialists. I also want ethical approval for pooled analyses that is not delayed. And in some form I want access to a secure and flexible analytical space, so that I can aggregate as many data sets as possible and use the best analytical approach and tools.
So that's a very brief overview of the way I think systematic reviewers would look at a national data asset of this form. I just want to touch on a couple of points. First of all, it's not just about the IPD itself; it's also about the study protocol, the data dictionary, the statistical analysis plan, the clinical study report if that is available, and then finally the IPD. So when thinking about this asset, we have to think not only about access to the IPD data set itself but also all the other assets associated with it that really enhance the data set's value. Now I wanted to take a couple of minutes to talk about metadata. This is critical to our world. First of all, any metadata associated with any research asset, whether a journal publication or an IPD data set, must be accurate. I think it's a fair summary to say that the majority of the time spent in systematic reviews is spent largely because of poor metadata. If we had much richer and more accurate metadata, it would dramatically accelerate systematic review and make it a much more feasible, efficient and cost-effective process. That, I think, is critical to keep in mind. Metadata must also be flexible to different types of data. As I mentioned just before, there are different assets that are of importance when thinking about systematic review, so we need metadata that is flexible to those different elements, but also in some ways future-proofed for other types of assets that may become available. Curation is critical, linking back to the first point: whatever system we establish for the curation of the metadata associated with this asset is going to be a critical point in the decision making. Lisa and others could talk for quite some time about the challenges of metadata curation in the context of clinical trial registries, and we certainly see the challenges downstream in systematic review.
So the appropriate role of the data owner or submitter, and of the QA process around that, is essential to get right. And please, please, can we think about structured and computable metadata. In 2020 it's really not tenable to be thinking about just user-generated natural language; we need metadata systems that are much more structured and computable. In the end this is all consistent with the FAIR data principles, which I imagine are familiar to most of you, the principle being that when data are more discoverable and more usable, they enable more analysis and therefore more value to be generated from the asset. Just very quickly to note that at Cochrane we've been spending a lot of time on metadata systems, and increasingly we use very broad data assets rather than the individual bespoke approaches that have perhaps been characteristic of systematic review for some time. We're using machine learning, crowdsourcing and a particular approach to structured data called linked data to enhance the metadata that is available, characterising study design, topics, PICO, and finally the type of data contained. Sitting under this is a lot of work on an ontology. This is Cochrane's PICO ontology, which characterises the structure of the knowledge relevant to health systematic reviews. It's linked to a number of controlled vocabularies, many of which will be familiar to you: SNOMED CT, MedDRA, ATC, RxNorm and many others. And we've built an infrastructure that enables us to curate metadata for all of the research assets within the Cochrane sphere, which then enables much better discoverability of those assets. Here's just an example of how we can use that structured data to filter by population, intervention, et cetera. We've been using this for a few years within Cochrane itself, curating our own data assets, but now also increasingly in partnership with other groups.
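To make the "structured and computable" point concrete, here is a minimal sketch in Python. The schema, field names, identifiers and vocabulary codes below are all hypothetical illustrations, not the actual Cochrane PICO ontology or any HeSANDA metadata model; the point is only that PICO-style structured records can be filtered programmatically instead of being read as free text.

```python
import json

# Hypothetical structured study metadata. Every field name and code here is
# an illustrative assumption, not a real registry or Cochrane schema.
studies = [
    {
        "id": "STUDY-0001",  # unique identifier for the data set
        "design": "randomised-controlled-trial",
        "pico": {
            "population": {"label": "women at risk of pre-eclampsia"},
            "intervention": {"label": "aspirin", "atc": "B01AC06"},
            "outcome": {"label": "preterm birth"},
        },
        "assets": ["protocol", "data-dictionary", "ipd"],
    },
    {
        "id": "STUDY-0002",
        "design": "cohort-study",
        "pico": {
            "population": {"label": "children"},
            "intervention": {"label": "behavioural program"},
            "outcome": {"label": "BMI z-score"},
        },
        "assets": ["protocol", "summary-results"],
    },
]

def find_studies(studies, *, intervention_label=None, with_asset=None):
    """Filter structured metadata records rather than free-text descriptions."""
    hits = []
    for s in studies:
        if intervention_label and s["pico"]["intervention"]["label"] != intervention_label:
            continue
        if with_asset and with_asset not in s["assets"]:
            continue
        hits.append(s["id"])
    return hits

# Because the records are structured, they are also trivially serialisable
# and exchangeable between systems.
print(json.dumps(find_studies(studies, intervention_label="aspirin", with_asset="ipd")))
```

A reviewer asking "which studies of aspirin have lodged IPD?" becomes one computable query instead of a manual trawl through PDFs, which is the discoverability gain the FAIR principles describe.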
And just to mention that this metadata system is now being used by Vivli, a non-profit clinical trial data sharing platform established in the US, which is taking a federated approach. So it's a relevant example, I think, for us thinking about our approach in Australia. That metadata system is enabling more precise discoverability of their data assets. And just to finish: as an example, we are now developing and maintaining national living guidelines for COVID-19 using all the systems I've just described. Through that, we're able to have rigorous evidence-based clinical guidelines that are updated every week, a pace that has not been achieved previously. That is, of course, sitting largely upon summary data from journal publications, but through the development of assets such as this, we're hopeful that in the near future we will also be able to draw on individual participant data. Lisa will describe in a bit more detail the additional value of those data sets. Thank you. Thanks also for the invitation. Just to disclose my affiliations: I work at the Clinical Trials Centre at the University of Sydney, I'm part of Cochrane in the role of co-convenor of the Cochrane Prospective Meta-Analysis Methods Group, and I'm the Manager of the Australian New Zealand Clinical Trials Registry. From September I will be at the World Health Organization in Geneva, but just to point out that the views I express today are my own and not necessarily reflective of any of those organisations. Sorry, can we progress the slides? Yeah. So, as Julian pointed out, clinical trials registries already enable and permit storage of quite a lot of information about trials, stepping through the progression that Julian talked about.
So from a trial registration record, before the first patient is enrolled, we know a lot about the methods of a trial, and records can be updated during the course of the trial to add more information about approved sites, levels of patient recruitment, et cetera. Certainly in the Australian New Zealand Clinical Trials Registry we can already lodge many of the documents Julian talked about: the protocol, the analysis plan, data dictionaries, et cetera. It'd be fair to say they're not particularly well curated or findable, but they can and do exist. We also permit the uploading of summary results and baseline characteristics from completed clinical trials, either in a structured format (clinicaltrials.gov has a very structured system where you have to enter data in a particular way) or, for essentially all the other registries, in an unstructured format, a PDF essentially, where you can lodge summary information. But none of the registries currently permit the lodgement of raw, line-by-line, de-identified individual participant data, and as Julian's outlined, this is where we think there's a major value-add for this project. It should also be noted that we haven't just dreamed this up. Since 2018, any person conducting a clinical trial and wishing to report it in the major medical journals governed by the International Committee of Medical Journal Editors must submit a data sharing statement when they submit their manuscript, and that statement asks, you know, what, how, where, when, and the mechanism for actually sharing data. Many trialists, particularly investigator-initiated trialists, often haven't had a place where they can lodge their completed and finalised individual patient data set. So, just to make sure people are clear what I'm talking about, individual patient data is different from aggregate data.
When you do an aggregate data systematic review or meta-analysis, you go and find the papers, then try to extract information from them. We now have some more sophisticated systems, as Julian pointed out, machine reading, et cetera, but essentially we're extracting from papers or publications. Conversely, individual participant data involves people. You actually gather the people; here's one we prepared earlier, some time ago, gathering many trialists from around the world, and you extract the data from those trialists. So what we're talking about is actual line-by-line data: row by row for each patient, and column by column for particular data points, variables, characteristics, outcomes. In an IPD project you often ask people to provide the data in a particular format, with particular selected variables. What usually happens is you get something like this. Now, it's a data asset, but is it actually all that usable? The short answer, as Julian has emphasised as well, is no, not without proper metadata and curation of that data and all the other supporting material that goes with the actual data set. So if it is accompanied by the relevant metadata, what sorts of things can we do with IPD that we couldn't do without that line-by-line data? One of the most important is looking at differential treatment effects in particular subgroups. I'll give a simple example, but it's one from my world; I've come from the perinatal trials world, and it's a problem when people publish their data using different cut points for a particular outcome. The outcome of very preterm birth means different things to different people in different parts of the world. In somewhere like Malawi, a baby born at less than 34 weeks is a real disaster; most don't actually survive.
For people in, say, Venezuela, the relevant cut point is at a different gestation, and for us here in Australia the cut point is different yet again, because we have a different healthcare service, and we make a lot of different treatment decisions depending on whether a baby is very small and very early versus greater than 34 weeks. So each trial will have reported those things quite differently, and quite legitimately for the setting in which that trial was done. But what we can't do is combine those data very easily from the publications. Whereas we know that for every baby in every trial, the trialists will have recorded the actual gestational age. So with access to the actual line-by-line data, we can use a cut point that's consistent across all the trials. These subgroups, and putting the data together, are very often the focus of future trials, and it's one of the reasons why the NHMRC now highly recommends a systematic review of the data as justification for your trial: you want to see whether there are particular subgroups of patients where the treatment effect might differ in direction or magnitude. We can also use IPD to look at different outcomes than were the focus of the original trial. The example here is from the group I showed before, where we looked at aspirin to prevent pre-eclampsia. Some years later, there was much more focus and interest in the effect of aspirin on spontaneous preterm birth. Rather than go off and do another whole series of trials, we could reanalyse the data from more than 30,000 women and babies, because we had the information about whether a birth had been spontaneous or iatrogenic, and could make an assessment about the effect of antiplatelets on spontaneous preterm birth without necessarily having to do a whole lot of new trials. What else can it do?
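Before moving on, the cut-point harmonisation just described can be sketched in a few lines of Python. The records, trial names and column names below are entirely made up for illustration (real IPD would of course be de-identified and far richer); the point is that each trial's rows are recoded against one shared threshold instead of each publication's own cut point.

```python
# Hypothetical line-by-line (IPD) records from two trials whose publications
# reported "very preterm" at different cut points.
trial_a = [
    {"patient": "A1", "arm": "aspirin", "gestational_age_weeks": 33.5},
    {"patient": "A2", "arm": "placebo", "gestational_age_weeks": 36.0},
]
trial_b = [
    {"patient": "B1", "arm": "aspirin", "gestational_age_weeks": 31.0},
    {"patient": "B2", "arm": "placebo", "gestational_age_weeks": 33.0},
]

def recode(records, trial_id, cutpoint_weeks=34):
    """Re-derive the outcome at one cut point, consistent across all trials."""
    return [
        {
            "trial": trial_id,
            "patient": r["patient"],
            "arm": r["arm"],
            "very_preterm": r["gestational_age_weeks"] < cutpoint_weeks,
        }
        for r in records
    ]

# Because the raw gestational ages are available, the pooled data set uses
# one definition of "very preterm" regardless of what each paper reported.
pooled = recode(trial_a, "A") + recode(trial_b, "B")
events = sum(r["very_preterm"] for r in pooled)
print(f"{events} of {len(pooled)} pooled participants below 34 weeks")
```

With only the published aggregates, this recoding is impossible: the papers would each report counts against their own local threshold, and those counts cannot be converted to a common one.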
Sadly, we are realising that fraudulent data is now becoming a problem. Jennifer Byrne, my colleague at Sydney Uni, was a couple of years ago listed in Nature's top ten people who mattered that year, for using software to detect systematic errors, and sometimes fraud, in genomics data. We've done some work recently with Ben Mol and others at Monash and in the Netherlands, looking at ways of assessing data from clinical trials to see whether they were actually unlikely to be real. This was published in a European journal, and it concerned data from trials that had been used in European and US guidelines. Interestingly, someone posted on ResearchGate confirming what we'd found: that in fact the data from thousands of patients were not real. We concluded that a peer reviewer working in isolation, without access to the underlying IPD, is very unlikely to be able to detect the patterns that we could detect with IPD available from not only individual trials but a series of trials. Finally, one other thing IPD can do and add value to is better investigating the interplay between participant-level characteristics and intervention-level characteristics, and thus better tailoring or implementing effective interventions, either at scale for particular communities or socio-demographic groupings, or at an actual personalised medicine level, which again I think Julian referred to. We've been doing some initial work on that, looking at childhood obesity prevention, deconstructing interventions and then matching that against participant-level characteristics in a way that will make it more possible to scale up these interventions in a more targeted way. I'll just finish by saying that we know that most of the time the trials we do are individually too small, which is why we do meta-analysis.
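To make the "too small" point concrete, here is a back-of-the-envelope sample-size sketch using the standard normal-approximation formula for comparing two proportions. The target effect chosen (reducing an event rate from 10% to 8%) is an illustrative assumption, not a figure from any of the trials mentioned.

```python
import math

def n_per_arm(p1, p2, alpha=0.05, power=0.8):
    """Approximate per-arm sample size for detecting a difference between two
    proportions, via the usual normal-approximation formula."""
    z_a = 1.959964  # two-sided 5% critical value of the standard normal
    z_b = 0.841621  # one-sided critical value giving 80% power
    p_bar = (p1 + p2) / 2
    num = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p1 - p2) ** 2)

# Detecting a modest absolute reduction (10% down to 8%) needs roughly
# 3,000+ participants per arm, far beyond the few hundred patients in a
# typical single trial, which is exactly why pooling via meta-analysis matters.
print(n_per_arm(0.10, 0.08))
```

Larger effects need far fewer participants (try `n_per_arm(0.10, 0.05)`), which is why individually small trials can only answer questions about big effects, and pooled IPD is needed for the modest effects that are common in practice.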
Here's the median sample size from the initial batch of COVID trials, all with median sample sizes in the hundreds, when what you really need is in the thousands if you're going to detect the sort of differences we're expecting. If we put them all together, we would have thousands. Some people have been talking about this for some time, basically saying all data should be open and available to everyone. This is the AllTrials group; this is Ben Goldacre, and as you can see there are usually a lot of television cameras around him. But on the other hand, I think there are potential disadvantages to a completely open system: the potential for data dredging, misuse of the data, and privacy concerns. What we're really talking about here, and what Cassandra is proposing, is moderated access through a recognised data repository, which can potentially overcome many of those problems. But it does need to be something that follows the FAIR principles, is safe, has good governance and is well resourced, which is what the group is going to work towards. As I've said, it needs, and does have, close links with trial registries and data sharing. So I think both Julian's presentation and hopefully what I've presented to you show that we do need big data, and in this era we need fast data, but that data needs to be reliable and true. We still have some bumps in the road ahead of us, but I think this national clinical trials data asset will be a major step forward in our ability to improve transparency and data access, and to reduce research waste. Okay, I think I'll finish there, Kristen. I'll stop sharing my screen. Thanks very much. We'll hand over to Roxanne now. Thanks, Kristen. Can you hear me? Yep, perfectly. Beautiful. I'm Roxanne from the Australian Institute of Health and Welfare Metadata and METeOR Unit. I'll be moderating today's Q&A session and presenting questions to the panel members: Adrian, Lisa and Julian. We'll start with some questions that came out of the pre-workshop survey.
I'll just bring up the screen. Okay, so the first question is for Lisa. How does Cassandra align with clinical trials registries and other clinical trials management activities? Does it overlap or enhance? Well, hopefully I've pointed out that it enhances. As I said, there isn't an ability to lodge individual patient data on any of the trial registries. There is some ability to lodge some of the metadata, but it could be better curated, and I think a system where you can follow a trial all the way through, from registration to actually getting the individual data in an organised system, would definitely be an enhancement. Thanks, Lisa. I'm just going to put a reminder here as well: if people could please complete the breakout room preference while I'm posing questions to the panel members, that would be great, so that we can assign you to the appropriate room. So the next question is for Adrian. How can Cassandra accommodate data governance issues and roles, for example data custodianship instead of ownership? A bit of a short answer for now: if we don't get to all these questions in the allocated time, we will answer them all and put up a frequently-asked-questions kind of register. Yes, data governance, access, custodianship and ownership are all very, very important issues. We have in fact dedicated the fourth themed workshop to exactly these kinds of issues. So for today I'll just pause there and say yes, we agree that this is a very important issue, and it's part of the initial consultation. Thanks, Adrian. There are a couple more questions here for you. How will Cassandra deal with existing data sharing and interoperability standards, technology platforms and repositories? Will Cassandra implement its own standards? Again, we agree that this is very important, and our approach is to adopt and adapt existing standards.
Again, similar to the previous question, we've allocated a full workshop to that in the consultation process. We will not be creating new standards. In fact, we will be looking for existing protocols and standards that are out there. At the end of a three-year period we may be able to register our profile of those standards, to say these are the bits of all those standards that apply to us, but we will be trying very hard to adopt and adapt. Wonderful. So, another follow-up question. How will health consumers' and research participants' interests be addressed and protected in Cassandra? Obviously, nothing can work here unless we get the actual participants on board, and in fact look at what value this asset can provide to the participants themselves, and make sure that they're comfortable with the way in which it's being built and delivered and the way access to the asset is being granted. All of those things require the input and ownership of the participants. I'll note that we have a whole stream of Cassandra for this, that third yellow stream that I talked about, and we will be doing some active stakeholder engagement with the patient groups and the trial participants. Thanks, Adrian. I'll address the next question to the entire panel. How could Cassandra support interinstitutional research and projects? Julian, do you want to talk to that first, and then I'll say something? I think there's an issue in the way that data is made available. I guess it depends on whether you're talking about primary research or research on the data once collected. Certainly, I think the vision of this is to support better, more efficient and increased access to data, so that large collaborative groups or others can access it no matter which institution they're in.
You know, we live in a globally fragmented clinical trial data sharing world, with many different repositories, many of which are located at individual institutions. So part of Cassandra, and I think of similar initiatives around the world, is trying to break out of those individual institutional containers and enable data sharing to work much more easily across institutions. Yeah, I agree. And I think what this will recover is what is currently wasted. Again, with trial registries, you can find a trial internationally if people register it, and most people do now, so the information about what trials exist is there. But so much of the data is not. I've done a lot of individual patient data meta-analysis collaborations, and I'd say 95% of the trials I've been involved with have tons of data that they collected but never shared; it's not in the publication, and you don't know it's there until you collaborate. So, as Adrian and Julian said, trying to speed up this whole process is what we're hoping this will do. I think it's going to enhance what we already do with the registries, but in a much more timely, structured and curated fashion. Thanks, Lisa and Julian. So, I'll open another question to the panel. Could Cassandra support the provision of real-time health data to assist with timely reporting and emergency responses? The example here is COVID reporting. I think, Adrian, I might talk to that first. The focus here is really on research data, rather than data that has been collected through routine care practice in health systems. But as per the point I made earlier, I think we have to think about how we can bring different data sets together. So Cassandra would enhance the ability to combine those research data sets with the data that's being generated through routine practice in health systems. Adrian, do you have anything to add to that response?
No, the focus is obviously not on hospital admissions or big data in general. That's not part of this project, though there are other areas where we're collaborating in that space. But as Julian said, they've been able to update guidelines on a very rapid basis, and at the moment it's difficult to link from clinical trials and other output data into those administrative data sets, because the trial data aren't terribly well managed, so you can't do that. So that's exactly where I think we can contribute. I think the culture stream that you talked about, Adrian, underpins that: a discussion about data sharing, specifically with regard to research data here, but it's that culture of trust and of sharing data in a secure, useful way, and I think this whole initiative will raise those discussions nationally. And just to add: working with the AIHW, we're hoping to adopt and adapt some of the information standards that are used in these national data sets anyway, which will again, as Julian said, allow this sort of integration into aggregated pools in a much quicker way. Thanks for that response, everyone. It's actually a great lead-in to the next question, which I think you've essentially answered. Is there potential for alignment between Cassandra and the ABS National Health Survey or other health service government data, such as data linkage? I think you've provided a very sound response there. Do you have any further comments around the ABS National Health Survey specifically that you'd like to note, or should we move on to the next question? Nothing from me. All right, so the next question is for Lisa. How would trial protocols be impacted if the data were to be made available for sharing afterwards? Okay.
The most important thing is consent: making sure within the protocol that people are aware their data may be used secondarily, after it's de-identified and with proper governance, et cetera; it's not a free-for-all model. So I think trials now need to make sure they have that somewhere in their protocol, and it's becoming more common. And if we're to get to the world Julian wants us to be in, with living, real-time accumulating data, we can't be spending, as I usually do, a year trying to get everything re-consented, with data-sharing agreements and all sorts of other things, mainly because in the past we haven't asked, or specified within the trial protocol, that there would be reasonable reuse of the data to answer the same or similar questions that the original participants consented to. We think it's a really important part of the initiative, and we will be allocating resources to consensus, common approaches and templates around consent, which the Cassandra participants can all contribute to and then start to use, as part of that broader coherence and coordination we're trying to promote through the Cassandra initiative. In terms of that broader coherence and coordination, what sorts of incentives and barriers will Cassandra address for organisations in sharing their data? Yes. I'll start with the things that can be done within the infrastructure and move out from there. The key thing is acknowledgement: having the use of the data properly referred to and referenced. The fact that it exists as a data asset means it can be referenced clearly, publishers can see it, and it can be part of the profile and output of the researchers themselves. That's probably the key one: getting that reward and incentive system in place. So the infrastructure can allow the data to be referenced.
There are a number of culture and policy things to do with rewards when you're applying for grants and rewards within institutional systems, but I think they rely on the evidence: did this data get used? The actual system can help with that, to say, well, here it was downloaded by this project, and we've got a report from them, so we can build that into the infrastructure to show how the data has been used. I think that will be the key incentive in the long run for organisations and researchers: being able to get acknowledgement. I think there is also a little bit of tinkering needed in the scholarly system, to readjust from the sort of obsession with journal articles and balance that with other data reuse incentives. Thanks for that, Adrian. Unfortunately, we're out of time for Q&A, so that will be our last question for now. I just want to thank all our panel members for their responses. We will collate all questions from today's session and also invite participants to provide written submissions through the SurveyMonkey form, and of course the breakout sessions will provide opportunities to consider more specific use cases. Before we head into the breakout rooms, I've been asked to give a really brief overview, given that we're running a little bit over time, of AIHW's data development process, which is informing the consultation approach for Cassandra. So I'll take control of the screen sharing and give a quick overview of that. Can everyone see my screen? Yes, we can. Wonderful, thank you. First, we'll have a quick look at AIHW and our role in Cassandra, and consider best practice data development, which will help guide the discussions in the breakout rooms we'll be moving to shortly. AIHW is an independent statutory authority, established to develop, collect and produce health and welfare-related information and statistics. Our products contribute to health promotion in Australia.
The Metadata and METeOR Unit within AIHW supports our metadata capability. Metadata includes all the contextual information required to understand data, for example data definitions and code sets, and you would have gathered from Lisa and Julian's presentations earlier that metadata is extremely important, for example for systematic reviews and the like. Our expertise is in developing national data standards to harmonise collection and reporting across Australia. The benefits of national data standards include accessibility, consistency and comparability for data use and reuse. AIHW uses an authoritative expert body, the National Health Data and Information Standards Committee, to endorse health data standards for use across Australia. The process of building a data set like the Health Studies Australian National Data Set is described as data development. AIHW uses its expertise to assist other projects and organisations, like ARDC, to undertake their data development activities. We use established principles and processes, outlined in AIHW's Guide to Data Development, to produce high-quality data that meets user needs and to build consensus on the content and quality of the data requirements. Data development is a methodical process informed by a set of guiding principles, outlined on this slide. For the purposes of today's workshop, we'll emphasise just two of these principles. Principle three is about being clear about the purpose of the data collection. This primarily involves considering what questions you're trying to answer with the data collection and what information is required to achieve that; these considerations will directly shape your data needs and development. We'll also look at principle seven: data development may be incremental. There may be some information that can be readily defined and captured, and other data that's more difficult to quantify or reach agreement on.
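As a concrete illustration of the data definitions and code sets mentioned above, here is a minimal sketch of what a standardised data element with a value domain can look like. The element shown is invented for illustration; it is not an actual METeOR entry, and a real registry records far more context (collection methods, guides for use, endorsement status, and so on).

```python
# Hypothetical sketch of a national data standard's core content:
# a data element definition plus a value domain (code set), so that
# values can be validated and compared consistently across collections.
# This element is invented for illustration, not a real METeOR entry.

smoking_status = {
    "name": "Person -- smoking status, code N",
    "definition": "Current smoking behaviour of the person.",
    "data_type": "code",
    "value_domain": {
        "1": "Current smoker",
        "2": "Ex-smoker",
        "3": "Never smoked",
        "9": "Not stated/inadequately described",
    },
}

def validate(element, value):
    """Return True if the value is permissible under the element's value domain."""
    return value in element["value_domain"]
```

The point of standardising at this level is that two collections using the same element definition can be pooled or compared without the re-mapping work that otherwise dominates data reuse.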
It doesn't have to be achieved in one go; it can be an incremental process. The early focus on clinical trials demonstrates this approach for Cassandra: it's a relatively well-defined space to start in, and the application can be extended over time to keep providing value. The principles just covered underpin the steps we follow through the data development process. We'll be looking at the foundations for data development today and building on these outcomes in future workshops, so that's a good example of the incremental process. The foundation for data development is understanding what information we're trying to gain from the data, and why; the key questions to consider here are about establishing the purpose and the benefits. This embodies the principle of being clear about the purpose of the data set, and it can be informed by generating value statements. Broadly, Cassandra aims to provide access to the outputs of health studies, facilitating the reuse of data in research communities for improved outcomes. How Cassandra should provide value for you will be explored further in the breakouts. We'll look beyond operationalising the technology to explore the capabilities and use cases that Cassandra aims to support, consider the identified value streams while also addressing any gaps, and comment on how these could be enhanced through the Cassandra program. We'll be looking at the user stories and value propositions that will inform later stages of the development process. And I'd like to invite you all now to join your assigned breakout rooms, since we're running quite a bit behind time.