Just to start us off, I'd like to acknowledge the traditional owners of the lands on which we're meeting today and pay my respects to their elders, past, present and emerging. I'd like to begin by stepping back and reiterating what you probably know fairly well, but I know we've got others attending these sessions as well. The Australian Data Partnerships Programme sits under an overarching initiative here at the ARDC, the National Data Assets Initiative, under which a number of programs sit. The initiative was designed to establish strategic partnerships across research communities and institutions. What we're looking to do is develop a portfolio of national-scale data assets that support leading-edge research, and we want this to lead to long-lived data assets that leverage existing research and administrative investment and ensure ongoing sustainability and stewardship. The principle that underlies this initiative, what we're doing and why we're doing it, is that we think of collections of data as national research infrastructure when they support leading-edge research and they are national in scale. Those of you who have projects under this program will be very familiar with this, because you had to address these as criteria for the program. Okay, so the Australian Data Partnerships Programme, as I said, is one of five current programs sitting under the National Data Assets Initiative. It's the largest program, though perhaps only on some counts. We have a number of projects and it is our most broadly focused, so it was our most open program for which you could apply. It aims to create or develop high-quality data collections that support, as I said, leading-edge research and are national in scale because they have data from multiple organizations.
Or they have data that's consumed by researchers, policymakers and NGOs. Or they have governance arrangements sitting over that data that include multiple organizations. And we're trying to support the establishment and development of these data assets where the forces of a competitive and uncoordinated research market have been impeding their emergence. So we're looking to step in and fill a gap and a need. Now, on that note, we might pass over to our first presenter. I'm going to run the slides, and I'm just checking that I've covered everything. If our first presenter could unmute, we can get the slide ready. So you should be able to see that we've got Professor Claire Vajdic, and she is talking about the Linked Data Asset for Australian Health Research. I'll hand over to Claire.

Thanks very much, Catherine, and hello everyone. Yes, LINDAHR is our Australian Data Partnership. Next slide, please. Our project seeks to address a problem that is very dear to my heart: there's no enduring national linked health data asset that is accessible by the research sector. As a result, public funds are being wasted on duplicated researcher efforts to access data on a project-by-project basis. These data access and approval processes are cumbersome, costly and inefficient, greatly reducing our research capacity. As a consequence, Australians aren't benefiting from timely and cost-effective health research. Now, whilst we have benefited from increased funding of the jurisdictional health data linkage units, Australia now lags behind comparable countries like the UK, Canada and New Zealand, which have national initiatives, enduring linked health data assets and streamlined access for researchers. Next slide, please. Our solution is the creation of a new enduring national linked health data asset at the Australian Institute of Health and Welfare, with researcher access via a secure remote analysis environment.
We'll build automated data processing and curation pipelines for the data asset. This will reduce the impost on data providers and facilitate the inclusion of up-to-date data. We'll also build a code sharing platform to facilitate and encourage collaboration and, importantly, reproducibility of outputs. Critically, we'll develop best-practice governance arrangements for access and use by trusted researchers. Now, we know several other initiatives are working on data governance, and we will ensure that we learn from and align with them. And last but not least, we will develop operational and licensing arrangements for the asset. These outcomes will be achieved through partnership, collaboration and consensus. The key partner is AIHW, through the leadership of Geoff Neideck and Nick von Sanden. Integral to the success of the project are our partners from the Commonwealth and jurisdictional governments, universities and medical research institutes, and consumer organisations. Vital steps in our work program include establishing the data custodian and research sector requirements for data use and access. Next slide, please. Our project will increase access to public sector data by Australian researchers. This will in turn increase the quality and timeliness of our research and surveillance and the practice and policy translation from that research, and it will generate improved health outcomes. There will also be benefits for the public sector from streamlined processes and automation, as well as significant economic gains from an enduring linked data asset and the process efficiencies. We believe that the partnership and co-design processes will increase trust between the public and research sectors. And finally, the project will increase the competitiveness of Australian researchers and the international investment in Australian health research. Thank you.

Thank you, Claire. Now Steve is up next.
I should have said at the beginning that I did plan to ring a bell to keep us all to time, and we have, probably my fault, got a little away from time, so I will try to be on that. If you hear a bell, just finish up what you're saying and we'll keep it moving. So Steve is the director of the Australian Data Archive and he's going to talk about ANZLEAD.

Okay, thank you very much, Catherine. So this is, yes, ANZLEAD. We're still trying to find the best way of pronouncing that. Let's say that it's the Australian and New Zealand Leaders, Elections and Democracy data asset. So this is a collaboration between, I think, eight or nine institutions, universities around Australia and in New Zealand, along with the Australian Electoral Commission, looking at how we do work around the integration and harmonisation of politics data assets. So next slide, please. Okay, so in political science we have a community of very heavy data users. They are willing to share, actively engaging in and supporting the use of secondary data assets and sharing their own data assets. The community is highly ready for and willing to work with shared research data, and there are a number of established data assets covering a broad range of scope and time, which I'll quickly highlight on the next slide. The challenge in political science is that these assets are not well integrated with each other. They all refer to some relationship within the political system, but across them, and even among those that focus on a particular type of population or a particular type of unit, we see bespoke processes, bespoke vocabularies and different types of identifier systems and standards. So huge amounts of work go on, with each researcher basically integrating the same thing over and over. So similar to what Claire was describing about repeating these processes, we have public data where this isn't being reproduced.
So what we're looking at here is that, for example, the names of political parties or the identifiers of the electorates that we vote in vary from study to study and election to election, so we don't repeat the use of these things over time. So fundamentally the problem we're trying to solve is: can we harmonise, integrate and make available those sets of shared services and shared identifiers and vocabularies across the system? So next slide, please. So we're working on four broad areas, and each of those has one or more data sources that are part of this project. For the voting public, we have periodic election studies. For political elites, we're talking here about parliamentarians but also candidates. We have elections, the process through which the voting public elects political elites. And we have electorates, the groups that they represent, both geographically and in population terms, and they're elected into parliaments. So we have these data sources, and we know that they're talking about the same thing at the same point in time, but what we don't have is an integration process that allows these things to be readily connected to each other. Each of these is done by hand each time. Next slide, please. So the solution we're looking towards here is fundamentally an integration and harmonisation exercise, with a series of shared systems to allow us to do this. A common reference model for the areas on the previous page. Shared vocabularies for this type of information, based on authoritative sources. We have the Electoral Commission involved for election results and, similarly, the ABS for demographic information and census data, and we want to publish some of these into Research Vocabularies Australia. So one of the partnership discussions we're having with ARDC is how we make better use of Research Vocabularies Australia.
A linked data model to allow the querying of the different collections, oriented to the unit of analysis of interest. That might be parliamentary members: I might want to understand, and this is topical even to the day, how long a cabinet member survives in their role before they might be removed from the cabinet or from the parliament. Harmonisation of these datasets for long-term comparative research, with these comparisons then potentially extending into the international domain. A lot of political scientists are interested in comparative political studies. Why does one system work in one country and not another? Shared storage through the ADA and related organisations, and accessible methods based on open access and FAIR principles. And so to the impact. We've touched upon three broad impacts, but fundamentally a lot of the political dynamics that we see occurring, both in Australia and internationally, are among the questions we're trying to answer here. So, for example, the relationship between economic deprivation and high immigration tends to reduce satisfaction with democracy. Bringing together the sources helps us to understand the dynamics of those processes: to what extent, which are leading and lagging indicators, how those interactions actually occur, and what the processes are through which particular types of populism or far-right dissatisfaction emerge. What's the interaction between those elites, the members of parliament and those contesting parliament, and their constituents? How do that interaction and the processes through which these organisations, elites, individuals and voters emerge evolve, and how do candidates respond to these situations? And thirdly, does the design of the institutions that we use actually impact upon the types of democratic attitudes and the types of democracies that we get?
So these are questions across the jurisdictions here in Australia, and eventually we'll be looking internationally at these sorts of questions, to better understand the sorts of political dynamics we see emerging both in Australia and internationally. And that's it.

Thanks, Steve. We'll hand over now to Elizabeth Wenk from UNSW to talk about AusTraits. Is Elizabeth there?

Sorry, I guess I could see you. I'll try that again. Sorry, I'm presenting on AusTraits, a national database on the traits of Australia's complete flora, and I'd like to start with a big thank you to the ARDC for funding us. We really appreciate their vision of co-investing and co-creating. Next slide. So simply put, the challenge is that there exists quality trait data for most of Australia's 26,000 plant species, but this data is scattered across countless journal articles, reference books and floras. Functional traits are anything about a plant's form or function that you can measure, anything from the maximum plant height to its capability of resprouting, seed shape or leaf size. And these functional traits tell us how plants make a living. Our challenge now is to unify this data, standardize the trait names, bring it into common units, error-check and distribute. Next slide. We've been on this path for the past three years, building a skeletal AusTraits, and now we're going to expand upon it, improve data quality and distribute. So there are three ways in which we're going to expand data coverage. The first is by harvesting legacy data, transcribing those reference books, floras and journal articles. But at least as important is inspiring the current research community to constantly submit their data to us, to make AusTraits the database where they want their data to be archived because they want to be part of it. And third, for a small number of traits, we'll employ statistical gap filling to get near-complete species coverage.
One of the main ways we want to improve data quality is by formalizing trait vocabulary. Especially for categorical traits, researchers have a very diffuse set of terms they use. We know that a database like AusTraits is not static. It has to work for researchers. And so we've identified a number of researchers across Australia who will be constantly testing this database, seeing how it works for their research questions, and then we can iteratively improve upon it to make it ever better for the community we're serving. And finally, we already have a number of biodiversity portals and research partners with whom we're discussing the best way to distribute this data. AusTraits data needs to be freely available to everyone: university researchers, the government, the general public. Next slide. So simply put, the impact of AusTraits will be to have an accurate, comprehensive database of plant trait data that will accelerate many research projects, make them more economical, and make it feasible to answer many broad-scale questions about plant adaptations. For instance, if we know how a trait distribution shifts across the landscape, it is much easier to model a species' response to climate change. Or, following a disaster such as last year's bushfires, knowing how every plant responds to fire will let people efficiently inform disaster response as well as make better-informed long-term conservation strategies. And we also view this as a special opportunity to bring together plant researchers who work in just enough different fields that they might not be aware of the research one another is doing and what trait data each of them has. By bringing all of their data together into one database, it will be much easier to use. And finally, by having that data distributed, as I said, to everyone across Australia and across the world, we will inspire more people to be interested in Australia's plants. Thank you.

Thank you.
Chiara is going to be talking about the Australian Companion Animal Registry of Cancers.

Yes, thank you very much. ACARCinom, as you said, is the first Australian Companion Animal Registry of Cancers. Next slide, please. As you know, cancer is a major public health and economic issue, and its burden is set to spiral. In 2018, we had 18 million cases of cancer in humans and approximately 10 million cancer deaths. And cancer is also a widespread disease in companion animals, with one in three dogs developing cancer during their lifetime and 15-30% of dogs dying of cancer. Now, complete cancer surveillance data are really important for making recommendations for cancer prevention and control, for drawing appropriate conclusions about the burden of cancer, and for designing analytical studies to identify associations between exposure and cancer risk. Cancer incidence data are best derived from cancer registries, which have existed since the 1940s in humans. But veterinary cancer registries have been sporadic, short-lived and uncoordinated. So in Australia, and I would say worldwide, we have an enormous gap in this space. For example, in Australia, we are missing 1.6 million cancer cases and their cancer data in companion animals every year. Next slide, please. So our project wants to develop the first Australia-wide registry of animal cancers, ACARCinom, to provide a sustainable, unified, integrated and accessible data asset on animal cancers in order to identify trends and patterns in animal cancers and their geographic distributions, to conduct studies on cancer risk mapping, and also to link the animal cancer data to the human cancer data so we can do some comparative oncology studies.
This project is possible only through inter-institutional collaborations with the major veterinary schools across Australia, two of the most important veterinary pathology labs, Rebos and IDEXX, and the partnership with the Australian Cancer Atlas based at the Queensland University of Technology. Next slide, please. The impact of this project is actually threefold. ACARCinom will allow the development of a state-of-the-art decision support platform to guide best practice in animal cancer care. And it will be a sustainable, integrated data asset for performing comparative oncology studies, by enabling linking with existing human cancer registries. I don't know if you're aware, but companion animals are also sentinels or indicators of certain human cancers of environmental origin. So ACARCinom may help in identifying cancer hotspots and uncovering associated predisposing environmental risk factors for cancers in both species. You can better understand the impact of this project by taking a look at the breadth of different stakeholders, from veterinary clinicians to cancer researchers on the human and veterinary sides, cancer organisations, policy makers in government and the broad community. Thank you very much.

Now to Tom Walsh from CSIRO.

Thanks, Catherine. Yeah, so our title is a bit of a mouthful: the Australian Invasive and Pest Species Genome Partnership. We're wrestling with a couple of other names to make it a little bit more catchy, Oz pest being my current favourite. I have a bit of animation in here too, unfortunately. So Australia's need, I guess, is to protect our environment, productivity and health from pests. And this is everything from mosquitoes, which are potentially vectors for Buruli ulcer down in Melbourne, a bit of a problem at the moment, through agricultural pests to things like crown-of-thorns.
And a key challenge for Australia is managing the impacts of these pests, weeds and diseases and the problems they inflict on our economy, environment, health and way of life. And this isn't just exotic and unwanted plant pests. There are various lists of things to be aware of that exist across government. And there's a lot of money at stake if we fail to control things. Many of the current control measures involve pesticides or other fairly disruptive approaches. And the future of control and management of these pest species often requires a genome: if we want to understand how populations are moving, if we want to look for negative genotypes, things like resistance to pesticides, and for some of the newer genetic control approaches that are being proposed for things like mosquitoes and fruit flies, which really do require a genome. You have to have one of those before you start working in this space. But there are significant barriers to entry for researchers and also for funders. It's often been regarded as expensive and difficult to do a genome. We've done several in the past at CSIRO and with our partners, and so we have the experience there, ready to go. And it's been increasingly recognized across various conservation organizations, management authorities, the RDCs and various federal agencies that being able to understand these pests in more detail through the genome is the future of control and management. Next slide, please. So the solution is locally focused, high-quality genome reference data assets. What we mean by locally focused is that often there may be something out there on some of the big international databases, but it may not be completely relevant to the Australian situation. For example, think of feral rabbits. The rabbit genome is actually based on a domestic rabbit from Europe.
It probably doesn't reflect very well the feral beasts we have out in the fields in Australia. And particularly if we were going down some sort of gene drive, genetic control type approach, we really need to understand what we're dealing with out in the field in terms of variability, population structure and movement. So our path to impact: where there is an asset, we will make that available and usable. Where there is no genome or asset, we'll assemble and annotate the genome. Where there's an inappropriate genome, we'll create an Australian version, and where the genome is poor, we'll improve it. And that will lead through to a set of tools, a set of reference data assets, and data storage and availability, taking advantage of CSIRO's deep computing resources, that people can then go and use with a relatively low barrier to entry. Next slide, please. So our aim and our impact is that there'll be an asset, but what we really want to do is catalyze the research and the applications, reduce that barrier to entry for omics, democratize the use of and access to the reference data, and really stimulate research and applications. If you think about it, DNA is just another data layer out there in the environment. Next slide, please. And the aim is to bring a whole variety of different things into scope by having this layer of DNA. We have experience of this within our own projects, but we're really looking to be the place that starts off some of these things. And obviously we want to motivate the funding agencies to think about this and to generate more of these assets. And we think this is already working. Next click. So the GRDC, the Grains Research and Development Corporation, is already funding projects with other researchers on the basis of genomes that we've helped to produce through the GRDC project. Next click.
The Great Barrier Reef Marine Park Authority: we're also talking to them about things like crown-of-thorns. Being able to control that marine species out in the ocean without going through and individually stabbing each one with some sort of poison is actually very difficult, and so they're looking for a different approach. Next one, please. We've been able to convince people that there is actually something here and that they should invest in it internally as well. And that's it. Thank you.

Thank you so much, Tom. And our final presenter today is Alan Mark from the University of Queensland. Thank you, Alan.

Hi, my slides are nowhere near as beautiful as some of the others. So our project is about trying to consolidate and essentially reuse computational data, which is being generated all across the country. If I can have the next slide. So basically, whether I'm doing drug design or material design, increasingly this has become dependent upon high-level calculations. Some of these are quantum mechanical calculations. Other calculations involve actually simulating the motions of the molecules. Now, these represent an enormous amount of resource, literally hundreds of millions of hours on the national supercomputers. And a lot of these calculations are being repeated. Many individual groups will be running the same molecules over and over again because the data is not being stored in a consistent manner. And many of these are deterministic, which means that it should really not matter what code is used or how the calculations are performed. So basically our aim is to centralise, organise and share both new and existing computational data, and in particular to make this easily accessible, so that data generated by multiple groups at different levels of theory can at least be visualised together, also as a way of understanding the limits of the methodology. So if you go to the next slide.
So what we're doing is building on two existing platforms. These were originally developed in my group, but a number of groups already feed into them. And so we actually have calculated data for around half a million compounds. And our current site has about 13,000, or 13,800, actually it's over 13,900 as of today. It is growing quite rapidly at the moment. Now, that's only a tiny, tiny fraction of the molecules which are particularly of interest. There are around 13 million compounds which are available for use, say, in drug design research. As part of this project, there are a number of other databases around the country which store similar related information. We tend to have uniform information over a large number of compounds. Other places do multi-level calculations on a smaller number of compounds, and bringing these together is going to increase the overall value of it. Similarly, in terms of simulations, there are literally thousands of terabytes of simulation data spread across the country, where there is no way at the moment for people either to identify them, say by document identifiers, or even to obtain them without writing to the individual researchers and requesting it. And the other thing that we're very interested in is that we have a constant storage versus regeneration cost. Some of these calculations can be easily reproduced. Other ones can't. As time goes on and computer power increases, we need ways to automatically curate this data so that we only store what's required in order for it to be regenerated if, in fact, with increases in speed, it becomes more effective to regenerate than to actually store. I might say, too, that some of these calculations run for six months or 12 months on the national supercomputers. So they're not trivial things that you decide tomorrow to recalculate. If we have the next slide, please.
So a lot of what we'll be doing is trying to validate and facilitate the development of methodology. A lot of these calculations are used as black boxes. We don't really know whether different approaches to performing the same calculations give equivalent answers over large numbers of molecules. They're central to the interpretation of experimental data. And also, because many of these are repeated, if we actually provide intermediate results, then people can use this as a springboard for their future research and stop a degree of duplication. So basically what we'd like is for these calculations to be done once, to be done at a high level, and then to be available for everyone to use. The major benefit is, in part, to accelerate computational drug design. That's why we went into it ourselves. What many of you may not be aware of is that in many cases, we don't even know the form of the compound which is actually active. Say, in the case of warfarin, there are 40 different forms, interchanging forms of warfarin, in a person's bloodstream. And we don't know which of those forms are actually active. Furthermore, the ones that people base their design on may be completely irrelevant to the actual function. And this is one of the reasons why computational drug design, and you can see it in the current pandemic, has not progressed as rapidly as one would expect. Finally, I'll just say that the ATB, which is one of our databases, is already one of the largest of its type in the world. It's already used for machine learning and data mining. And consolidating across Australia will not only give us the first bite of this data, but will also mean that we're attracting data from overseas, hopefully making Australia a place where people will come to test resources by giving us the data in order to get it into our databases. Thank you.

Thanks, Alan.
And thank you, everyone, who's presented today, and that brings us to the end of our presentations.