Alright, good afternoon everyone. Let's start. Welcome to the webinar series on research data information integration. This is the third in our series. Today we're going to be looking at storage for research data. My name is Paul Wong; I'm your host today. My colleagues Susanna and Xiaobing will be hosting today's webinar virtually with me. Research is certainly changing globally. One change is that there is greater emphasis on accessibility and reuse of data, and better management of research data leading to better research in the long run. Now, a unifying theme of this webinar series is the idea of the research data management life cycle, as I show on the next slide. We have two previous webinars: one on research data management planning, one on ethics clearance for research data. So today we'll be talking about storage allocations for research data. Through this unifying theme of the research data management life cycle we want to get a better understanding of how research data information is integrated throughout the life cycle. That means we need to look at the connectivity of different enterprise systems to support the management of research data. Our speakers today are Christopher from Deakin University, who is a research director; Vicky, a senior librarian from the University of Newcastle; and Jay, manager of research data services at James Cook University. So I'm going to pass the control to Christopher. Today I was just going to run over what we're doing at Deakin in this space of integrating storage with description and discovery, as I've loosely described it. So I'll start the presentation, if it's going to advance for me. We've got a fairly loosely coupled ecosystem to handle this at Deakin, which is great.
It's flexible in design, but as I've said there it causes a lot of confusion in practice, and I've had a lot of problems getting researchers engaged with the fabric because it is quite confusing, and you'll see what I mean with the diagram I'll present a couple of slides later. So trying to disambiguate some of that, and clarify what the tools are about and how they can actually assist, rather than inhibit, publishing data and using the storage, is really where the focus is at the moment. In the describe space we implemented ReDBox and Mint under Seeding the Commons and other ANDS-funded initiatives, and we call that Research Data Footprints, to describe the footprint of your research. We've got the discovery layer, so that repository isn't what we present to the world at large; we feed it into our Fez/Fedora research repository, which is called DRO, Deakin Research Online, and that's what Research Data Australia harvests for the individual records. The actual data that may be shared in an open way is made visible through a very simple, very basic portal called the Deakin data portal, which is basically just an Apache server on top of the data itself, and I'll show a demo of all these things later on to expand on those screenshots. When we were implementing this metadata repository we also implemented a research data storage system which allows researchers to provision storage themselves. We didn't have any strict requirements on that, so anybody can create a bucket to store data, but it is aligned with that data portal: when a researcher is ready, they can publish the data itself and it will link those things together, and that's what allows it to be exposed through the data portal. So how does this all fit together?
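The data portal described above is "basically just an Apache server on top of the data" — a plain directory index over the published collections. As a rough illustration only (not Deakin's actual implementation), the same idea can be sketched with the Python standard library:

```python
from html import escape
from pathlib import Path

def generate_index(root: Path) -> str:
    """Build a bare-bones HTML listing of a published data collection,
    similar in spirit to an Apache auto-index. Illustrative sketch only."""
    rows = []
    for entry in sorted(root.iterdir()):
        size = entry.stat().st_size if entry.is_file() else "-"
        name = escape(entry.name)
        rows.append(f'<li><a href="{name}">{name}</a> ({size} bytes)</li>')
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"

# Serving the same tree read-only is a stdlib one-liner:
#   python -m http.server 8080 --directory /data/published
```

As the talk notes, a listing like this gives no packaging, thumbnails, or descriptive context — which is exactly the limitation being discussed.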
This is the diagram I was talking about just before. We've got various components, and I'm sure most of you would be familiar with some of these systems in play, but basically the research management system is the source of truth for project and party data around researchers, and that feeds the repository I was just talking about. With the storage system you can create storage and choose to link it to a project or not. We're quite flexible with that, because we understand that the actual process of writing a grant can generate a bit of data before a successful outcome, so we didn't want to dissuade people from using the central storage that we've got on offer; it was also a carrot to stop people buying external hard drives and storing data locally on their machines. So having that resilient storage in our data center was a pretty key point for that service. The rest of it's pretty familiar to most of you: we mint DOIs against every dataset that's created and expose that through to this fabric down the bottom. So it is a bit of a quagmire and does cause a bit of confusion, but the presentation layer, which is our focus at the moment, is limited in that it's just a bucket of data and we're just presenting it as a list, so the benefit to the researcher is limited. That's what our focus is on now: how can we better make people aware of this storage that is available and how it's intended to be used, and how can we better display some of the data that people are generating.
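The talk mentions minting a DOI against every dataset but doesn't show the mechanics. As a hedged sketch, this is roughly the kind of DataCite-style metadata record a minting service consumes; the field names follow the DataCite kernel loosely, the values are invented, and the actual minting call (to whatever service Deakin used) is omitted:

```python
def build_doi_metadata(title, creators, publisher, year, landing_url):
    """Assemble a DataCite-style metadata record for minting a DOI
    against a published dataset. Illustrative sketch: field names and
    values are assumptions, not Deakin's actual workflow."""
    record = {
        "titles": [{"title": title}],
        "creators": [{"name": name} for name in creators],
        "publisher": publisher,
        "publicationYear": year,
        "types": {"resourceTypeGeneral": "Dataset"},
        "url": landing_url,  # the landing page the DOI should resolve to
    }
    # A minting service will reject incomplete kernels, so check early.
    missing = [key for key, value in record.items() if not value]
    if missing:
        raise ValueError(f"incomplete DOI metadata: {missing}")
    return record
```

The point of checking for missing fields up front mirrors the curation step described later, where the library quality-checks the description before the DOI is minted.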
At the moment I'm getting a lot of people creating storage containers, or collections, and just backing up their whole hard drive to them, and there's really no description or delineation in how they're describing things. So it's really identified to me that there's some pretty poor practice out there in terms of how people structure what they're doing, and that's where our library staff are helping out a lot, in those one-to-one or one-to-small-group discussions around how to better describe and manage data in the broader context. What I was also going to say is we've got a portal at Deakin called DeakinSync, and we're looking to provide some context there around what researchers are doing with storage. One of the ideas is, if they've got a successful grant outcome, to present to the researcher the option of creating storage if we know they haven't linked any to that project already. Because we've got all that metadata there, we can actually leverage quite a lot; with that portal we can provide a lot of value and direct the researchers to go there, to say, okay, you may want to be creating some records because we can see the project's been running and it's near the end of its life cycle, or, at the earlier stages, to actually create storage to put the data in that you're planning to generate with that project. The other options in the presentation layer we're looking at are discipline-specific or quite aggregated systems that allow you to display data for various different disciplines. We're only just starting to look at how we can integrate these things into this platform, or this ecosystem, and some of those things are like Omeka for the different disciplines that may want to create collections and manage them themselves and use that as their presentation layer, rather than just a bucket with an Apache index on top of it.
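The DeakinSync idea above is essentially a rules engine over project metadata: prompt for storage when a grant succeeds with nothing linked, prompt for records as a project nears its end. A minimal sketch, assuming invented field names and a 90-day "near the end" threshold (neither is stated in the talk):

```python
from datetime import date

def project_prompts(project: dict, today: date) -> list:
    """Suggest storage and record-keeping actions from project metadata,
    along the lines described for the DeakinSync portal. The schema and
    the 90-day threshold are assumptions for illustration."""
    prompts = []
    # Successful grant with no storage linked yet: nudge to create some.
    if project["grant_successful"] and not project["linked_storage"]:
        prompts.append("Create research data storage for this project")
    # Project nearing the end of its life cycle: nudge to create records.
    if (project["end_date"] - today).days < 90:
        prompts.append("Project is near the end of its life cycle: "
                       "consider creating metadata records")
    return prompts
```

This is the sense in which "because we've got all that metadata there we can actually leverage quite a lot": the prompts fall out of data the systems already hold.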
Figshare and MyTardis are being considered, MyTardis around image data, Figshare being quite general, and we're looking at Figshare for Institutions and how that could potentially play a part, or Mediaflux; we're still really investigating all those different options. So that's the real ecosystem, and I didn't want to go into too much on that; I really wanted to show you how it all functions. This is the ReDBox system we have, and most people would have seen that in the past. It allows you to create the data descriptions, as we're all well aware. What I wanted to show you here is the process we go through for each of these and how the DOIs are linked into the data portal side of things. The process is: they create a metadata record, and then when they're ready to publish the data they click publish in the store, which I'll show you in a minute, and then the links for that come into here and it publishes this data portal link. You may be able to see on the screen the URL down the bottom, which keeps those two things in check, and then when you go to view that actual data collection you can see it on this data portal; in this case it's the interview data, some Papua New Guinea audio interviews. So we replicate the metadata from that Footprints record and actually show the contents here, to be able to download if you want to, but it's very, very basic. There's no packaging of it, which would be really ideal, and there's no thumbnail sort of view, so really you're just downloading, in that first example there, 800 megabytes before you can actually understand what it's all about. Surfacing the metadata of that MPEG file, in this case, is not really done at this point, and that's where I'm wanting to get some improvements to present that better.
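The publish step just described links two systems: a metadata record in Footprints and a storage share, tied together by a portal URL "which keeps those two things in check". A small sketch of that linking, with an invented URL scheme and field names (the transcript gives neither):

```python
def publish_share(share: dict, record: dict,
                  portal_base: str = "https://data.example.edu") -> str:
    """Link a publishable file share to its Footprints metadata record
    and derive the data-portal URL that ties the two together.
    The URL scheme, field names, and base address are illustrative
    assumptions, not Deakin's actual scheme."""
    if record.get("portal_url"):
        raise ValueError("this record already has published data attached")
    url = f"{portal_base}/collections/{record['id']}/"
    record["portal_url"] = url            # stored on the metadata record
    share["published_to"] = record["id"]  # and back-linked on the share
    return url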
The data store is this system here. It's just a web application that hooks into our corporate storage, and what we've done is provide four collection types. We allow researchers to create activities, they can link those to a project, and then they can create these buckets to store things. So they can create a traditional network attached file share, which is these little yellow icons, and they can create any number of those; there's a nominal limit of 10, but they can create any number, and they are unlimited, they can put as much data in there as they like. The technology we're using now for that is Isilon storage, so snapshots are taken three times a day, with one snapshot at the end of the day kept for three months, so they've got complete ability to restore files and manage their data very flexibly. There is another one called a publishable file share: when they're ready to publish data they can create one of those. It's no different in terms of the technology, but it allows you to hook into the actual Footprints record, and then that little data portal link happens.
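The provisioning model above — activities holding collections of a few fixed types, with a soft limit of 10 file shares — can be sketched as follows. The data model and type names are invented for illustration; only the behaviour (four types, nominal-but-unenforced limit) comes from the talk:

```python
COLLECTION_TYPES = {"file_share", "publishable_file_share",
                    "sync_share", "wiki"}
NOMINAL_SHARE_LIMIT = 10  # a soft limit, per the talk: not enforced

def create_collection(activity: dict, ctype: str, name: str) -> dict:
    """Add a storage collection to a research activity, mirroring the
    four collection types described. Invented sketch of the behaviour,
    not the real data store's schema."""
    if ctype not in COLLECTION_TYPES:
        raise ValueError(f"unknown collection type: {ctype}")
    collection = {"type": ctype, "name": name, "warnings": []}
    shares = [c for c in activity["collections"]
              if c["type"].endswith("file_share")]
    if ctype.endswith("file_share") and len(shares) >= NOMINAL_SHARE_LIMIT:
        # Over the nominal limit: still allowed, just flagged.
        collection["warnings"].append("past nominal limit of 10 shares")
    activity["collections"].append(collection)
    return collection
```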
The other one there with the little star is an icon for a product called Syncplicity, so we're providing a Dropbox-like service, because we've got a lot of researchers working with external parties who have a lot of issues sharing data externally, and they can use this service now to do that. That's using our own on-premise storage with a sync-and-share platform on top, so it gives them unlimited storage, although unlimited in the sense that you need the storage on your local computer for it to really function. It has been taken up quite rapidly, because people really want that capability without having to pay for a Dropbox account and use that storage. The other collection, which I don't have in this demonstration activity here, is a wiki space: we've got a Confluence wiki instance which they can use for collaborative work internally. So the research data store has really gone from storage as in storing data to actually a store as in where you buy things, and that's going to expand; we'll be providing a whole lot of other services through this research data store, so blog engines and Omeka instances and a whole lot of different things will be provided through this one portal for researchers, and it'll all be tied together under this activity or project banner. A particular example I was going to show you is a Pacific sea star: Mark here has got some sequence data that he's produced and he wanted to make it open, so he's gone ahead and published it. He created the Fez/Fedora repository record through our Footprints system, and then he wanted to share it with the world. Originally he was working with the library and they stored the objects within the repository, which wasn't great, so now they're provided through the data portal, and you can download the gigabytes, or megabytes in this case, of files. One thing I'm advising researchers is to really be descriptive about what
that is. I'm sure people in his discipline understand what all those different file formats are, but it doesn't really have an overview sort of readme file; it could be described better, so we're working with them on that. It's presented with that hookup through that link there, and there's also a link available here so you can be taken straight to that record. All the DOIs map through to our research repository, so Footprints really is just a collection gateway that links those things together and allows the record to be curated as accurately as possible. So really that's all I was wanting to cover off today. Can Chris talk more about the publishing function? Absolutely. So we call it publish, but what it really does is form a link between the two systems. They create a publishable file share, which is just a network attached storage location; everyone should be fairly familiar with network attached storage, it's just a network drive, so they would just have a folder like this to store things. Let's just say this workshops one, for example, is how they would structure their data within that space. That's completely offline; it's not exposed to anyone other than themselves. Then when they're ready to publish the data... I'll just see, this is UAT so I might get some errors, but when they're ready to publish the data they can then click a publish button, quite simply. Here we go, it's ready to go. So this particular folder, which is fictitious because it's UAT — when they're ready to publish they literally do that, it will then look at all their Footprints records and provide a list of the ones that haven't been published, and they can just choose one. In this example here I've already published against this other one, but this one here I could potentially do, and then I can provide global access, to just say yep, anyone can get access, or I could restrict it to an AAF member, so in some way you
could limit it down to anyone who's a member of the AAF who could see that. So that's sort of semi-open in terms of this collection, and then within a few minutes that collection would be exposed through that data portal I showed you before, so you would see it appear here, or, if it was restricted, more would be exposed once I logged into the system. Anyone in Australia can log into this data portal, as you can see, and then see that. So that's how that's working. All right Chris, there's a whole bunch of other questions coming in as you do that. One says: what's the maximum storage space a researcher can request, is there a maximum? Did you say, yes, what is the maximum? None, it's unlimited. Well, I'm sure they'll all like that one. So our IT manage the growth and acquisition that has to happen, and they deal with that as it goes, so yes, it's completely unlimited. This next question probably ties into that: what was the cost of the implementation, or do you have data storage costs? There's no explicit cost; it's covered under our central capital expenditure on storage, so it's just factored into all the storage that the university buys, and there hasn't been an explicit cost for this particular service. At the moment we're up to about 100 terabytes, with another 60 at another site, so nearly 200 terabytes is what we're looking at. Not overly large; we don't have any astrophysicists with a petabyte in their back pocket, so it's probably relatively small compared to most institutions, but it is covered under that. They provision it under systematic procurement throughout the year, so they're always negotiating a new price for that storage, so I don't have to worry about that, which is actually a luxurious position to be in. Probably tying into that are a couple of questions which sort of meld together. One says: on use of storage by external-to-Deakin users, most
collaborations are national or international, so is it possible for external-to-Deakin users to use it? And there's another very similar question: is this service going to be available for researchers at other universities, and are there storage size limitations for that? The first bit is covered under that sync-and-share service, where a Deakin identity can provision it and share it with colleagues they're working with at other institutions, but there are limits to that, because if you're synchronizing to your own computer you need hard drive storage on your own computer. The traditional network attached storage, any Deakin identity can access, because they can connect via VPN, but external people can't; the way that's traditionally been handled at Deakin is we often register those collaborators as a visitor to the university, and then they get access to the storage. That's a little bit cumbersome, but most people know how to work around that and follow that process. And on the last question: no, there wouldn't be the ability for non-Deakin people to create the storage space in the first place; it really has to be instigated from Deakin's side. Okay, another one: are the researchers able to mint DOIs by this publishing method? Yes, so the Footprints system is where the DOI is minted. They are done by the library: when the library are performing quality checks on the description they perform the step of minting it, so it's done implicitly, in that the workflow of a metadata record is curated by the library and they're the ones actually doing it, but effectively it's a business-to-business transaction that happens on every one of those records. So the researchers themselves don't, but the library does it on their behalf. Okay, and then probably the last one, so we can keep to Paul's time: is all the data stored on Deakin
infrastructure, or is it stored on national infrastructure or with a local eResearch provider? Yes, it is all on Deakin infrastructure. Amongst our four main campuses we've got two data centers, and the data is stored within those data centers and replicated across the two. We haven't engaged with the RDSI-provisioned storage; it's all purely on-premise, which our researchers like, because it means, particularly if it's sensitive data, they can tick a lot of boxes in terms of the compliance they need to ensure. Thank you, and thank you for all those wonderful questions. Our next speaker is Vicky, so I'm just going to pass the control to Vicky. Thank you very much for the invitation to share what we're doing here at Newcastle. I'm going to talk about going from data to discovery in terms of our research data storage and the connections that we have. In telling the Newcastle story I'm going to tell you a little bit about the systems and the tools that we have, and then talk about the three workflows that those systems and tools make up, and the connections and integrations between them. So to tell the story, let me introduce you to the systems and tools in this space at Newcastle. For research data storage we have ownCloud; we have an enterprise version of ownCloud here. For data archiving and publishing we're using a software app that was created to run on ownCloud, called Crater. For data management and registry, the data management and metadata curation workflows, we're using ReDBox and Mint, similar to what Chris was just talking about, and for publishing and discovery we're doing that via our institutional repository, which is Nova here at Newcastle. So I'm going to talk a little bit about the workflows and how they all connect, and after I've done that I've got two short videos that actually show you that in action, so I can tell you
about it and then actually show it to you. Unlike Chris, I wasn't keen to do a live demonstration, because that would probably go wrong, so I'm using videos. These are the three workflows, and I'm just talking about the connections between them. First, our research data storage: in that is our ownCloud, which is the enterprise version, I think it's version seven that we run at the moment, and that sits on, I think, a petabyte, and on that we have this app which is Crater. Crater's development started way back in 2013; it was born from work that Peter Sefton was doing at the University of Western Sydney, or Western Sydney University I should say, at the time, and it was a collaboration between the University of Newcastle, Western Sydney, Intersect, who were doing the development, and, in those early days, the University of Sydney as well. Crater was about a problem we had identified in the library: wanting a connection from the research data storage into our data management and publishing workflow in ReDBox and the Mint. Since the development started back in 2013 there have been a few development cycles along the way, a few sprints and agile developments, to get to where it is now, and there's also some future development coming, which I'll tell you about at the end. So in that research data storage workflow, that's what's sitting there. In the data management and publishing one we have, similar to Chris, ReDBox and the Mint: ReDBox is our metadata store and descriptive curation workflow, and it's hooked up to the Mint, which is our name authority service for our party records, our staff members, our researchers, and also for our grants — sorry, that's what I was looking for — information about our grants. And then that's connected to Nova, which is for discovery. I'll just run through quickly. In the research data
storage workflow, that first one, what researchers, or users of it, do is log into the ownCloud environment that we have, and they create a crate; a crate is a data crate. They add files to that crate, and the files are the ones they work with, that they have on ownCloud, so they add them to the crate. Then they have the opportunity in Crater to also add metadata, and from there they can review the metadata and then publish the crate. When they publish the crate, a couple of things happen. One of those is that it comes to the library, into the next workflow, data management and publishing, and the researcher receives an email with the metadata. In the data management and publishing workflow, the one sitting in the middle there, that's where the library works on the metadata that's come across, the alert that's come across from Crater. That alert arrives into that system, the library works on it, so we augment the metadata and add metadata, and we probably have more conversation with the researcher to work on permissions and probably more on descriptions, and when we're happy with that we publish a record and it goes across into Nova for discovery, up through Research Data Australia. So this is Vicky's highly sophisticated schematic diagram; it's just a way of very simply demonstrating what's happening. We've got ownCloud, the researchers are in there, it's the storage they're working in. I should say that ownCloud is just one of the storage options we have at Newcastle, but if you want to have the connections to publish, ownCloud is where we have the ability and capacity to do that. From the Crater tool, two things happen when a researcher uses Crater and they publish or submit a data crate. They press the button, which I'll show you shortly, and two things happen. A metadata alert goes across to the ReDBox system; it's like the stub of a record, an alert that has information
that's been collected while the researcher has been working in Crater. The second thing that happens is that the data crate itself — a zip file that uses the BagIt specification that came out of the California Digital Library — goes into our storage layer. So the metadata alert goes across and is ingested into ReDBox, more work happens there in ReDBox on all that metadata, and then from ReDBox we send a RIF-CS record across to Nova. Embedded in that, from that metadata alert, travelling with it all the way through the process, is the URL to the data crate in the storage layer. The institutional repository has a private interface into the storage layer, so it's able to be the gatekeeper for access to the data: if it's publicly available, it's only publicly available through Nova via that private network access. So, quickly, this is a three-minute video just demonstrating what I've told you in terms of ownCloud and Crater. The researcher logs in, they see all their files in ownCloud, they're able to toggle up and they'll see a little icon that's called Crater. By default they have a default crate for their data, but they can create a new one, so I'm going through the process of creating a crate. This is my study on green frogs, that's the title in the metadata to go across, and this is a description of my crate; fix my typo, and I click to create, and now I have a crate. It's told me, up the top there in yellow, that I've got a new crate. Now I'm toggling back to my files on ownCloud, and now, by right-clicking, Crater will let me add to the crate. So I'm just adding in my data dictionary, my population information on frogs, my environmental information, and I've got some images; basically you pick and choose what it is, as the researcher, you want to package up, it goes into that data crate, and it's telling you as you go that it's adding things to
the crate, so you can actually see it. We'll go back to Crater, where you'll see the files have gone into our crate. Now, over on the right-hand side, the researcher or the user has some ability to add some metadata around those files that will go with that crate, and this is going across to start the stub of a record for the library to augment for publishing. So there's a few things here. We've created information with the title; the creators we're just adding now, and it's hooked up to our Mint system, so it's actually doing a lookup against the Mint and bringing that back. It's hooked up to the Mint again for searching grants, so we can select the grant and pull that information back in, and so forth. There's some work going on around what actual metadata should be here, which I can tell you a little bit about. There's a feature to check the crate, to make sure all the items are valid and still there since you added them. If you hit the button to submit, you get to review all the metadata you've entered; at this point you can go back and change it, or you can hit the submit button, and that submit button is sending it through from the researcher: the data crate goes to storage, and the information to the library. You can send an email to additional people you're working with to say that that's what's happened. So that's how Crater is running on ownCloud. The researcher can also zip up the data crate, a copy for themselves, and download it if they want to. So the submit button has done two things: it's sent that crate, that data crate or that dataset, to storage for archival purposes, and then it's sent it across to the library. This is our ReDBox instance, which is not publicly available, it's just for the library users, so it's fairly as-is, and I'll just start the process to show you what happens here. When you're logged into the system, the very first thing you see in the alerts is the alert that's
the Hunter Valley green frog study, the study of green frogs. That's arrived, and the source next to it says ownCloud-Crater, so it's telling the library where it's come from, from storage, and that it's arrived. So we start the process of looking at that record: we go into it and start working on it, and, as Chris showed you before, there are various things the library works on, so we can add lots of information there, through a conversation with the researcher as well. So that's basically how it works. This is just demonstrating that the information from the crate comes over and is populated, as put in by the researcher, into various sections in this system. When we're finished we hit the button to publish the record, so this is where we do it, and the record is published across to our institutional repository, and it just shows that it's actually been published. So finally, after the publish button, it arrives in Nova; behind the scenes it's sending the RIF-CS over as well, and it's harvested from there, we harvest it up to Research Data Australia. I guess the last thing I would say: so that's the process, those are the three workflows, and that's how they're connected, from the research storage in ownCloud, through the ReDBox link to the library, through to discovery on the other end, which is facilitated through Nova, with that connection back into research data storage if it's applicable. Lastly I would mention, as I've said, that there have been a number of iterations of development on the Crater tool, and AARNet are currently funding further development and enhancements to the tool, which will be trialled with CloudStor+. There's a group working on that: obviously AARNet, who are interested in doing the development, and also Western Sydney University and the University of Newcastle, because we've been working on this quite a while now. So that's the end of my presentation, thank you very much. Fantastic, thank you. If anybody has any questions, can you put them in the
chat. This one here already says: once a project is complete and all cratable data is packaged up and published and archived, how do you ensure researchers go back and delete all remaining redundant data in ownCloud? I'd have to — it'd be a policy or a business rule within IT, and I actually don't know the answer to what we actually do there. When you're selecting the files to add to a crate, Vicky, does it track where they are? If a researcher moves them around, does that then become disconnected from the crate? Yeah, it was fairly quick on the screen, so I didn't talk about it: what I flashed up there was that one of the icons across the top in the navigation in Crater was a check button, and in the demonstration I just did, it validated. The purpose of that is checking to see if names have changed or files have been removed, and it is exactly that, because it's referencing where those files are. So I presume then the advice would be to structure a location pretty much where you're going to have it, set it, and not change it too much? Yeah, but if you do, you just have to do a little bit more work when you're going to package it up. All right, so thank you very much for listening to me this morning. Today I'm going to focus, just like everybody else, on storing, accessing and exposing research data at JCU. For storage we have quite a few different options that we make available to researchers. We have HPC: all researchers can apply for an account on the HPC, and depending upon what they want to do they can use it for just storage or also for compute purposes. JCU is very fortunate to be an original RDSI node, so this gives us two petabytes of disk storage here, and access to the RDSI storage is available through an application process; we tend to encourage people who want access to larger disk storage to
apply with an RDSI application. The other storage we have is a system called Research Data, which is really a ReDBox; it's publicly exposed, and this one's designed for completed datasets. There's a self-submission workflow the users can go through: they can complete their requirements and attach files with a total size of up to 50 megabytes, so this is typically things like Excel spreadsheets and zip files, which is what we normally see. I'll just move on to my next slide. The other thing I was about to say is every researcher can also store files on these systems that need to be kept private, and we can expose them in different ways as well, depending upon which system the users like to use. For access, again, standard HPC access applies: SSH, SCP, FTP. Some of this can be challenging for some users, so we try to use other systems to make access to the storage easier, and this is very helpful to us. For RDSI storage, we can mount that on the HPC for processing or compute access. We have quite a large number of users here at JCU who are making use of Aspera Shares; for those of you who don't know, this is web-based access to RDSI storage, and this can be for tens of terabytes of data if you wish. This has been very helpful to some users in that, if they're at a location where connectivity is poor, Aspera Shares has been able to give them good throughput in terms of uploading their data and accessing it. There is also functionality to provide a sync-type capability using Aspera, but as Christopher pointed out earlier, it's dependent upon you having the local storage available, especially if you deal with many terabytes of data. Mediaflux gives us lots of options; we're focusing on portal functionality for Mediaflux, and we're currently working with Arcitecta on improving this, so it's a way that we can quickly create a
mini portal to expose research data and to have access restrictions on that. We can also create virtual machines to expose research data via different websites, depending on the project or the requirements of the user. And as I said, the other option I mentioned earlier is Research Data, where they can attach up to 50 MB for exposure. So this is where we tie it all together. Mostly at JCU, the system for exposing data is Research Data, which is our ReDBox instance, so it's publicly available, and there's a feed that happens once a week where ANDS harvests the records for Research Data Australia. There's another system called the JCU Research Portfolio that is used, and records from Research Data are displayed under a tab on the Research Portfolio. This is to provide information about JCU researchers, but also to show what sort of research data is available from those researchers. The information in the Research Portfolio is built using the JCU research management system. Now I'd just like to give a quick demo, if I can switch to my web browser, to try to show you how it all ties in. Here's our publicly facing ReDBox instance. I've pre-searched for a record that I know has got some links to data; we just rely on the researchers adding URLs to expose where the data may be. In this example, there's a public link to where the actual publication has been made, and the data is stored with that publication; also here there's a link to data actually sitting on our HPC, so the user can then download those zip files. Again, if there's something similar for data on our RDSI storage, we can expose that data using a similar method. Now I'm going to show you jcu.me. This is the Research Portfolio; jcu.me redirects to here, and you can search for a researcher. We can just use
Jeremy van der Waal, who has lots of records. As I said, if they have any data in our ReDBox system, Research Data, this tab will be generated and you can select the records from in here. What we can do is then click on the record. It is live, so let's wait a moment... here we go. This is just a listing of the information you would see in ReDBox, and if you wanted to, you could go off to the actual ReDBox record. And this is the actual data, so here's just a directory listing of the data that you can download, as we've probably all seen before. Here are just the records that Jeremy has in Research Data Australia. I'll just pick, let's say, some of his bird information; we click on the data provider, and there are the same links and similar topics. I think I'll leave it there, thank you. I'd like to open it up for questions, please. There aren't any through for Jay yet, so while we're waiting for ones to come through for Jay, I'll go back to one of the ones that was for Vicky, which was: how much training support do you offer for your staff? In terms of what you said, I'm assuming, if I'm reading it correctly, that you're talking about researchers in terms of the use of ownCloud and Cr8it. What we did to kick that off, a little while ago actually, it was last year, was we ran a workshop where we were introducing ownCloud and trialling Cr8it, and a lot of researchers came, for the purpose of giving feedback, so we did a lot through that session. There's online help and an in-context guide in ownCloud, and the same for Cr8it. At the moment we've got somewhere between 400 and 500 users on ownCloud, and as part of that process they have to go through an orientation session, which we currently deliver to them in person, and we're trying
to just transfer that online, just to actually get some information to orientate them, and then they get access to it. So they know what they should and shouldn't be doing from the time they start. Okay, one back now for Jay: do you link AAF credentials with LDAP for HPC use? No, we don't. Our HPC use is restricted to JCU researchers or people who are enrolled at the university or work here. I guess that question was probably asked around data access, so if I can talk about Aspera Shares a little bit more: we can provide a wider range of access to data that's exposed by that system. Our Aspera Shares is hooked into an LDAP that's managed by QCIF; for those of you who don't know, QCIF manage QRIScloud. We work closely with QCIF, and they have a portal: anyone who is a member of the AAF can log on to QRIScloud with their QRIScloud credentials, and provided they're then given access to an allocation, they can log on to our Shares machine, or the one based in Brisbane if they have storage there, and access the data that way. So via Shares we can give people access from outside the university, but QCIF also have a mechanism to provide access for people who are overseas as well; they can create accounts in there. Okay, and another one for Jay: do you also have SSH/SCP-type access to the Aspera Shares, or is it web only? Good question. Aspera Shares is web only. With the infrastructure underneath, it is possible to get SCP access to that storage; we usually do it by mounting that storage on, let's say, the HPC, and then it can be accessed that way. We haven't actually exposed the two servers that manage the Shares infrastructure; it is possible to expose them, but it has not been done. I can add to that too from a Deakin perspective: we have an interactive box that's attached to our storage, and that's how they can get SCP
SSH access to download, and they can run technical tools like screen or whatever. So if they've got a sequence file they want to download and it takes a while, they can just set it going and come back later. That's been taken up quite well. Now this one's back for Vicky; it says: is the crate owned by an individual or a project? From the interface, I would guess an individual in your ownCloud. It is an individual, and as you know, with ownCloud you can actually invite people to share the crate, so we'll share the environment so multiple people can have access. And then: Jay, you mentioned Mediaflux; have you been able to get this operational? Okay, we've spent a lot of time working on Mediaflux, and I'll mention that our main focus has been on the portals functionality. We've found so far that the presentability of those portals isn't very good with regards to being able to customise the CSS, but we've been working closely with them, and I think development is about to start very soon that will allow us to have full control of the CSS inside those portals to expose the data. I do have a couple of data sets in Mediaflux, but I'd say, I guess, watch this space; that's all I can say. I think it has great potential, but you need some developer resources to spend more time on it, and that will probably be me. Okay, here's a question for all the speakers. It says: what sort of processes or services do you envisage developing on top of this storage service? It was a question around data deletion: would data curation be a process or service to build on top? So perhaps, Vicky, if you wanted to start. Can you just repeat the question for me, please? Sure. It says, for all speakers: what sort of processes or services do you envisage developing on top of this storage service? There was a question around data deletion: would data curation be a process or service to build on top? Yeah, that's a hard one; maybe
easier for a data manager. We haven't really had any further discussion around what we might build on top of this, so I'm not really going to comment on that; I don't have anything concrete to actually say. But in my mind, what I would like to see, in terms of what we already have here, if I'm looking into the future with a crystal ball, and that's less than five years away but more than two, is a lot of the processes around the point of data storage and publication absolutely streamlined, so that there's less involvement by individuals and it's a lot more automated. That's what I would like to see. Hopefully that's the sort of service we should be able to put more time and effort into: processes that actually automate things and take ourselves out of the way of the researchers, so they're more in control. That's really what I would like, but that's probably not answering the question in the way that was asked, of course. Thanks, Vicky. How about you, Jay? Interesting question. I guess we're probably not there yet here at JCU. For instance, in our ReDBox we're capturing the time frames for which people want data to be retained, but we're not actually acting on any of that at the moment. As for the data curation side of things, particularly with some records submitted via ReDBox, our librarian is reviewing the records, but then we also have a look inside, let's say, the spreadsheets and things, to see if the columns are neatly labelled and that people can understand the data that's in there for external use. But yeah, I think we're not quite there yet either with regards to those sorts of issues. Okay, Chris? Yeah, we're sort of similar to that. Most of the energy has been invested, as I was saying earlier, in making people aware that the service is there and how we would advise, for their discipline, that they could use it. I think
it'd be a luxurious position to be in, to focus on the latter part that the question talks to, in terms of curation. Really, the direction I've been providing there is saying: you need to be working with best practice in your discipline, and if you're unaware of that, then we can work with you to come up with something and propose that. Then there's preservation, and that is a deep area; you could do a PhD on that. At the moment we're really telling people: stick with common denominators. If you're going bespoke, you need to think about the environments in which you would potentially need to access that data in five years' time, and it is really rapidly changing. So if you're choosing a vendor for your data analysis or capture that, I don't know, may go bust, or the technology may change three or four iterations, you may not actually be able to use that in the future, and that's something you need to consider. We haven't really invested a lot of time and energy in that. We're just about out of time here, so I'm going to have one last very small question, which is for all three of you: can students, especially PhD students, access these services to store their research data? Jay's going yes, Vicky's going yes, and Christopher's going yes. Fantastic, a wonderful way to finish. Paul, back to you. Thank you, those were wonderful questions, and thank you to all the panel speakers who provided their insight and shared their experience. It has certainly been very thought-provoking.