So, welcome everybody to the third workshop of the HeSANDA consultation series. Today we're going to be discussing the draft information scope for HeSANDA, how it aligns with current practices, and therefore its feasibility. To begin, I'd first like to acknowledge and celebrate the First Australians on whose traditional lands we meet, and we pay our respects to the Elders past and present.
Today's workshop is going to run for 90 minutes. There's been a slight change to the program: there'll be just two presentations to begin with, rather than the three that were in the email sent out to you all. We'll begin with Adrian, who's going to update us on the feedback we've received so far via these workshops and discuss how your feedback has been developed into a draft information scope for the national data asset. We'll be seeking your thoughts on the draft and its feasibility, and so our second presentation will be from Martin Ustendorp, who will explain how the NHMRC Clinical Trials Centre records and shares its trials data. Martin has kindly agreed to do this, and we wanted to give you all an idea of the kind of detailed feedback we're after from this workshop. So that's the purpose of that presentation: to give you an example of that kind of feedback.
Today, before we go into our breakout sessions, we'll be doing a brief Q&A after the talk, so please make sure to pop your questions into the chat channel. A quick reminder: your mics will be muted for the presentations; of course, the muting will be switched off once we get into the breakout sessions. On screen now are the focus questions we'll be asking during the breakout sessions. You'll have a copy of these, and a copy of the survey for today's workshop, in the reference document that you were emailed. So please refer back to that.
In fact, completing the survey is going to be especially valuable for us on this topic, as we need to record a lot of granular detail about your current practices. It's not so much your opinions on X, Y and Z, but really ticking boxes: which particular methods, approaches or technologies do you use? So, unlike previous feedback forms and surveys that we've used, it has a lot fewer free-text questions, and we hope it's going to be a lot more straightforward and quicker for you to complete. Also, thank you to everyone who's already completed the survey. We're actually going to be reporting on some of the initial trends that we're seeing in those responses, and we might as well get on to that as soon as possible. So I'll hand over to Adrian now. He'll provide an update on the feedback we've received from the consultations thus far and some of the conclusions that we've started drafting based on that. So, Adrian, I'll hand over to you.
Welcome, everyone. I'm Adrian Burton; I work at the ARDC. The ARDC is convening this whole set of data development workshops. We've been helped by the AIHW in the methodology of developing the specifications, if you like, for a national data asset here. The HeSANDA initiative is a broad-based initiative. We have input from a steering committee that includes the NHMRC, ARRA, ACTA and the ANZCTR, as well as Research Australia and Cochrane Australia.
We are in the middle of this process. I think you'll recall from last time: here we are at Theme C, our third workshop in this data development series. We started off looking at the research purpose, then at the specific data that would support that purpose. And here we are at number three: having identified that data, what are the current existing standards and practices related to that data?
As you know, at the end of this process we'll then be able to specify, or sketch out, some broad business requirements for the Health Studies Australian National Data Asset, HeSANDA. And we'll take that in two directions. One will be further consultation, more broadly, with trial participants and the wider research community; the other will be taking those requirements to a set of partner institutions who will be willing to build an infrastructure that could support this. But here we are today in Theme C, just asking you questions about the existing data standards and practices.
So these are the things we talked about last time, the themes that were emerging out of this data development consultation process. These are the value propositions that people were seeing: to provide some kind of standardized platform for accessing clinical trial data sets, and then to provide more coherence and coordination across the sector around those information products from research projects. That standardization would really contribute to some quite substantial efficiencies and productivity gains, and even to innovation in the new types of research that could be supported.
If you remember, we then asked what kinds of use could be made of such an asset, and a number came out. We've now highlighted in blue here the ones that have been reconfirmed through the second workshop: meta-analysis and systematic review, replication and reproducibility, and secondary research projects. These really have been confirmed as the three key research purposes. The other purposes obviously remain important, but if you look at it from the perspective of the workflow of science, you do those three on the left and they would enable you, through translational research, to do some of the things on the right-hand side there.
Anyway, the take-home here is that these are the big areas of research that are looking to be quite achievable through an Australian national data asset. As for the key information needs that people have there, themes are also coming out of our feedback: the study protocols, the individual participant-level data, and the other descriptions of the data, like the data dictionaries, are the key information needs. From the last workshop, people were saying that on top of that there are a number of nice-to-haves, including something to do with data quality as well as the data sharing and availability statements.
Specifically, we've really focused in on the value of individual patient data, and the requirements suggest that it needs to be clear whether the entire data set, or a subset or an extract of the data set, is available, and then whether any of the fields are incomplete or have been... So here we're really focusing in on some of the specific requirements around individual patient data.
Some other key needs, the stuff that would be nice to have: it would be nice to have all this stuff through a specific... you know, just to make it easier to find. Any guidance through this initiative would help too: if the coordination and coherence of this initiative can provide guidance around standards, that will be a big step forward, as would any templates for recording this key information. On the right-hand side there, there are a number of templates that are emerging as the key ways in which this kind of information is collected.
So that's where we've got to now: there is a kind of information scope for the HeSANDA initiative. It's really focusing in on the individual patient data, the study protocol and the data dictionaries, as well as statistical analysis and unpublished outcomes, but it's the first three that are the key focus emerging out of the consultation so far.
There are a number of other information artifacts that are already in the ecosystem: information that's already been registered with the clinical registries, stuff that might be in ethics information systems, and stuff that might be taken from publications related to the study. All of that together is the broader ecosystem, and if we're talking about HeSANDA bringing together the important data outputs of a clinical trial, then it would be great if we could leverage some of that existing information rather than having people copy stuff that's already, for example, in a trial registry. We don't want people to have to copy that a second time.
All right, so for today's workshop we've got a couple of questions around that. Having focused in on these particular data assets, the question now is really around feasibility. What are the existing information workflows? What do people do with the IPD, with the study protocols, with the data dictionaries? What are the existing information practices around them? Is it feasible to collect these as first-class outputs of projects? What would people be looking for if they were going to reuse them? What can we expect from the data producers as normal practice? So that's where we are now in the workflow of this data development: what are the current practices around the information artifacts that we're looking to build into a national data asset?
Related to this workshop there is a survey, which we've sent around, and I encourage you all to fill it out. Here are some of the very preliminary results. It's very early feedback, so don't take this as final, but if you can fill out the survey, it will really help us to converge on some of these important outcomes. So here are a couple of examples. The question is: how do you currently access trials information? And these, at the bottom there, are the
different types of trial information. So, for example, if we look at IPD, how do people usually access it? By far the winner here is the orange column, and that's via direct contact with the clinical trialist. In fact, if you look across all the different trial information types, orange is by far the winner so far in the early feedback. So that's an early trend that's coming out of the first responses to the survey.
Now, if you go back, you'll remember that in the initial consultations we were saying that a standardized way of accessing trial data is the key value proposition for a standard, and this is starting to show why: if people are currently accessing trials information, it is restricted to direct contact with a clinical trialist. That means it's really dependent on existing social networks or existing contacts, or on ferreting out contacts where they exist. So this is a really nice confirmation that there actually is good value in us being a lot more coordinated and coherent about accessing this information.
Anyway, this is just to show you that there is some early feedback. Other things have come out of it too. That first question was about people who are using information and how they do it. This question here is: if you're a trialist, do you actually record information about the trial registration, about the protocol, about the dictionary? In all these areas there are pretty good indications that people are recording this kind of information, except for the unpublished trial outcomes.
Again from the early feedback, just to show the kind of questions we're asking: here, green is what you currently do and blue is what you'd prefer to do. So, for example, for IPD, how do you make that available? "I make it available via my institution's servers and shared drives" is overwhelmingly how it's done. For the blue, how would you like to do it?
Well, "I'd much prefer to have a formal repository, or to do it via a registry." So again, here are some early indications: in most cases, between the green and the blue, people are moving away from just a drive, or a USB stick that I keep my stuff on, and would like to move towards some kind of more formal repository arrangement.
So for today's workshop, apart from the kinds of questions that are in the survey, we'd really like you to look at questions like these: what kind of standardization would be desirable, or is possible, and what are the key elements of each of these clinical trial data outputs that would be required? I think that's probably where we'll leave it. In general, what we're trying to get out of today is this: the specific data scope that we're starting to see emerge for HeSANDA, will it meet the needs of future researchers? The second dot point: does it align with existing data practices, and is it possible to pull that kind of information out of clinical trials? And does that information scope require refinement? Those are the key things we're looking for at this stage of the consultation. So I'll just hand back to Christine and stop here. Thank you.
Great, thanks a lot, Adrian, for that. I think at this point we'll hand over to Martin. Just a reminder that we will be doing a Q&A following Martin's presentation, so if people have questions, please do pop them in the chat. Okay, over to you.
Thanks very much, Christine. Let me just find my screen here. So, as Christine said, I work for the NHMRC Clinical Trials Centre at the University of Sydney, an expertise centre for the design and conduct of clinical trials, with some additional departments in our team that also look into health economics, reviews, translation to practice, translational research, et cetera. So data sharing is certainly a relevant topic in our organization. My role in the organization is that
of a trials program manager, so I look after a team that coordinates and manages the operational aspects of the clinical trials, in my case for a portfolio of oncology studies.
I was asked the question, I guess based on the survey: do you record the following information, and what standard procedures or templates do you use? The short answer really is yes to all. Clinical trials are a heavily regulated environment, so you really need to abide by the core principle of "say what you do and do what you say". That has been further worked out in GCP and other guidelines and regulations. So for all the items listed in that question we do have standard operating procedures, and for the majority of those we have underlying templates, checklists, systems, et cetera, to operationalize them, if you will.
Just going through the various items: we register all our clinical trials prospectively; that's required by the editors and the journals nowadays. Most commonly we use ANZCTR, and we use its online questionnaire, so there's no other dedicated tool involved. It's very useful; I would highly recommend choosing it. ClinicalTrials.gov and some others are used as well.
We do have a study protocol template and a checklist, and that's to ascertain that all the essential elements of a study protocol are captured. Ours is based largely on the SPIRIT statement, so if you're still looking for a template, that's a great place to start. TransCelerate, which is an industry initiative, also has a good resource there. It's a bit more industry-focused, but still a good place to start.
With regard to data dictionaries, or the data structure: we capture all our clinical trial data in state-of-the-art electronic data capture systems. For our better-funded, larger trials we use Medidata Rave and InForm, and those are, let's say, the best systems used throughout the world across industry and academia. For studies with a limited budget we have REDCap and OpenClinica, also tools that provide robust data collection and management capabilities and have good audit trails to ensure you also meet your regulatory compliance. We do use a series of templates, or let's say standard data elements and structures, throughout those systems; we like to reuse those wherever we can. And obviously all the systems we develop for each trial have their own database specifications, usually outlined in an annotated case report form. Our templates are loosely based on the CDISC data conventions, such as their Study Data Tabulation Model and their standardized data elements, CDASH. Not entirely, in part because CDISC only really started in the mid-2000s and is in ongoing development. It also doesn't cover every single potential study-specific data element, but it's again a good basis if you are looking to standardize your data collection across studies.
The same goes for the statistical analysis plan. We've got templates for that, and TransCelerate has a good solution there, maybe a bit more comprehensive than needed for your average investigator-initiated study, but it still contains relevant information.
Now, with unpublished trial outcomes, I couldn't really think of a specific item that we collect. We collect all our data, including outcomes, in our data systems, so it's not so much "not applicable" as "see above". When it comes to sharing that, it's driven by our data sharing and publication policy, and that is probably also the most contentious component. Usually not many people are keen to share their data until they have adequately published the results that are underpinned by that data, to prevent missing out on the opportunity to publish themselves. Whenever we share data, we do want that captured in data sharing agreements, which can be separate agreements or part of an overarching agreement. As part of the University of Sydney, our office of legal counsel has templates for all sorts of
agreements, including that. The resource I've put here is not so much a template, but it's a good article about the benefits and risks of data sharing and the considerations you should give to data sharing plans; what to include in agreements can be derived from that. So that's another good item to look at.
And then finally, this was not part of the question, but Christine, you asked me to expand a little bit on whether we also have tools in place for actually doing the data sharing. And again, yes, we do. We've got an SOP on transferring data, with a checklist that ensures you cover all the elements needed to make your data transfer secure, and that you actually transfer the data you wanted to transfer, in the right format and at the right time. In the general process, I guess, the data sitting in those electronic data capture systems that I described under the data dictionary are extracted by a programmer or a statistician, usually using SAS, and then we use a secure data sharing tool. That can be something like a secure file transfer protocol or a cloud-store-type solution. There's no single big solution for that, because often the third parties that want to receive the data will provide such a solution. Just a word of caution: there are a lot of free data sharing tools, such as Dropbox, and before you start using one for transferring individual patient data, bear in mind that not all of those solutions have adequate privacy and security protection. So do confirm that.
Just to continue a little bit on the clinical trial when you look at it from an operations point of view. This is by no means a comprehensive overview, but I guess, looking at it as a non-researcher and more at the clinical trial as a logistical exercise, you can split it up into several phases, from early development all the way to close-out, and at all these stages you need to put in place agreements, processes, systems and tools to execute that study. And when you think of that
also in the scope of data sharing, ideally you identify early on what your intent is with regard to data sharing. At the CTC we do have a data sharing policy that outlines, in general terms, whether we are in principle willing to share data, and that's assessed on a case-by-case basis. And if there is already an intent for a study to share its data, maybe for a prospective meta-analysis, it would be useful to define that already in the protocol template, so that your process for doing the trial results in good data collection that is suitable for sharing and merging with other data. That then follows through into your data collection tools.
I won't go through all of these because of the time, but some of the key aspects to outline here, I guess, are agreements. On the one hand, if you are going to share data, you should have suitable agreements in place to protect against inadvertent use, or use beyond the scope of what you're actually entitled to do with the data, as driven by privacy and security regulations, the participant's informed consent, et cetera. On the other hand, pre-existing agreements may limit your ability to share data. If you have a commercial funder, there may be some expectation that you will not publicize the data, other than for your own research, before that funder has had an opportunity to look at it. There are plenty of reasons for limitations there. Very important as well is the PICF, the participant informed consent form, which drives the extent of the data you're allowed to share for that patient. So a PICF should really clearly stipulate what level of data you may be sharing, with whom and in what regions.
Then, I guess, there is the data transfer specification, which I mentioned. That's another key thing: it's important that you have a good process in place so that, once you identify that intent, it's executed well. I guess I'd like to
end by reminding everybody of GCP. I'm sure everybody is very aware of it, but really what it says is that if you set up and conduct your trials properly, you already have the processes, systems and templates in place, in any case, to reproduce your trial. Now, reproduction is not the same as reusing your trial, but it certainly will facilitate that. So there is an opportunity: clinical trials are highly regulated, and high-quality data is usually the outcome, so these are probably the type of data sets that are very suitable for reuse.
But of course there are plenty of challenges with that as well. First of all, you've got limitations at an agreement level. There's some significant work in setting up the process, and certainly if you hadn't prospectively intended to share the data: if you have to set up that system post hoc, find out from the requester what they actually want, and make them understand your data, it's quite time-consuming. And yes, as Adrian already outlined, a lot of organizations are looking into these processes, but they all give their own spin to it, so there's quite a bit of variability across organizations, and that makes it a bit challenging to align those data sets once you're ready to share them. So it's an interesting field, and it's good to see that there's actually a good initiative like this.