Good morning, everybody. My name is Bill Michener. I'm the PI and Project Director for New Mexico EPSCOR. We scheduled an hour and a half for this, but I think it'll probably take about an hour or less; that's been the trend thus far, but we will certainly take questions throughout. The reason for this meeting is that there have been some big changes with respect to federal agencies in terms of how they are treating data products associated with funded U.S. research. On March 27th, excuse me, my allergies are kicking in here, on March 27th we received a Dear Colleague Letter from the National Science Foundation, and that's available publicly. It's NSF 20-068, if any of you want to go read it yourself: an Open Science for Research Data Dear Colleague Letter. It indicates, again, some major changes within NSF, and likely other federal agencies will be jumping on board as well. But let me just read parts of three paragraphs to set the stage for today's webinar.

Starting here: "In alignment with the benefits of open science, NSF is undertaking an expansion of its public access repository, NSF PAR, to include metadata records about the research data that supports the journal and juried conference proceeding manuscripts resulting from NSF-funded research. The metadata records about the research data will contain sufficient information to allow for data discovery and access determinations to be made, but not necessarily all the metadata required for reuse of the research data. Research data will have a digital object identifier that was assigned to it prior to being reported to NSF. Research data will not reside in NSF PAR, but will instead reside in a repository, data center, or data portal managed by an organization that is committed to ensuring the availability of the data over time. The anticipated location of research data associated with a publication, if known, can be identified in the data management plan and budget in the proposal. Research data in support of a publication are, one, the data necessary to confirm the validity of the scientific results reported in the publication; two, the data described by the publication; or three, as specified by the journal or conference proceeding. Complementing the publication, the metadata record for research data in support of a publication will, as does the publication, become part of the public record, on the NSF website, of the scientific contributions of an award."

So, this is in the process of being implemented at NSF. Initially, things are voluntary, but over the next few weeks we anticipate that we will see some very hard and fast rules coming down from NSF. And the groups or projects that are most likely to be evaluated first are the large projects, like EPSCOR projects, which are $20 million in size. So we essentially need to be getting our data products in order, and these are the data underlying the research findings in, again, peer-reviewed articles or juried conference proceeding manuscripts. These data products all need to have DOIs, and they need to be deposited in a community-recognized repository. Fortunately, we have the capacity to provide all of that to all of you, students, yourselves, and anyone associated with the project: we can provide a DOI, and we can archive and preserve the key data underlying manuscripts and conference proceedings in a community-based repository.
So what we're going to do today is, first, Karl Benedict and John Wheeler will go over some of the tools and services that can be provided in order to expedite this and, I think, make it quite painless and fairly easy to do. And then, secondly, we will hear from Diana Dugas at New Mexico State University about the high performance computing capabilities we have available through EPSCOR. We can provide easy access to those high performance computing facilities, as well as consulting and other services that might be necessary to translate code or get your jobs up and running on the new hardware that we purchased as part of EPSCOR. So without further ado, I'm going to go on mute, probably cut my video off, and turn it over to Karl and John, after which Diana will take over and cover the high performance computing. I encourage you to stay throughout the entire meeting, and we'll take questions throughout as well as at the end. So thank you. And it's all yours, Karl.

Great. Hopefully you can see my screen. I wanted to start out highlighting the information you can go back to anytime as a refresher for what we're going to be talking about today, and that's the two critical items that relate to data management: first, our plan for how we're going to manage, document, and share the data products coming out of the project, that's our data management plan; and second, a reference document that John and I produced that provides a more detailed, step-by-step process for getting registered with, and submitting data sets into, the Dryad repository as one part of our overall data preservation and sharing plan. So I was going to start here on the EPSCOR site and highlight that if you go into Resources and then Team Member Resources, on the page that comes up you can scroll down to the section on high performance computing and data management. In that section you'll find the link to NMSU's web page, whose components Diana will be highlighting in terms of getting set up and being able to use the high performance computing resource that we have. In the context of what John and I will be talking about, there's also a link to the project data management plan, which provides the more detailed information about exactly what we committed to NSF in terms of the types of data and documentation we would provide, and what our strategy would be for actually providing access to those data and that documentation, both during and beyond the life of the project. And then there's a second document, the data management workflow, that highlights how to get set up and how to work through the process of actually submitting data for preservation, discovery, and access in the platform we have for public access, which is Dryad, one of potentially any number of repositories that would be appropriate for your data. If Dryad is less appropriate than, say, a disciplinary repository that is specific to the type of data you're generating, we can work with you on identifying those disciplinary repositories and on getting your data into them as well. Today we'll be focusing on Dryad and the workflow for getting content into the Dryad repository, as the one we have direct access to through the support of the EPSCOR program. Having said that, I'm going to now switch over to the Dryad homepage.
So Dryad is a data repository that is managed by and for a broad community of organizations focused on long-term data discovery, access, and preservation. Through our funds from the EPSCOR project, we have been able to set up an institutional membership with Dryad that EPSCOR is funding over the course of the project, and we're working at UNM to identify an additional strategy for continuing that institutional membership beyond the life of the EPSCOR project. Part of that is actually demonstrating the value and impact of the data products that are going into the repository. The Dryad institutional membership model is based on affiliation with individual institutions, so in this case the institutional membership is through the University of New Mexico. But since the EPSCOR project is essentially administered from the University of New Mexico, all of our EPSCOR researchers have access to the benefits of this institutional membership, the most significant one being that we are able to add data sets to the repository without paying an additional deposit fee. Folks who do not have an institutional membership, or who are not working through an existing publisher agreement with Dryad, actually have to pay to deposit their data into Dryad; through our membership, that fee is waived. The one caveat is that since this institutional membership is linked to the University of New Mexico, there are two paths for getting content into the repository. If you are a UNM staff, student, or faculty member, you can log in to Dryad using your UNM email address credentials, and you can add your own content into Dryad by virtue of those UNM credentials. If you are not a student, staff, or faculty member at UNM, you would need to work with John or myself to get content deposited into Dryad. It's an artifact of the way they have structured their institutional membership and how you can log in. The login process is very simple, and the preferred method is through an ORCID. If you don't already have one, an ORCID is a unique identifier for each of us as researchers that allows us to link that identifier to the data sets we produce, the papers we publish, and the grants we submit and receive. It allows for unambiguous identification of us as individual contributors and developers of products, so we can get the credit we deserve for all of that work, and it's easier to actually track down, categorize, and collect the products we're generating. So to log in, in this case I'm just going to click on "Login or create your ORCID iD." I've already gone through this process, so I've already linked my ORCID account, in this case a personal account, with my email address, and I've already established a password in ORCID; they're using ORCID to authenticate me, so I'm saying sign in to ORCID. And now I'm logged into the system, and I can see any data sets I have in progress in addition to data sets I've submitted. I'm going to open up one of the existing data sets that I started in one of our other workshops to show you an example of what the submission workflow looks like, in terms of the information that is needed to document a data set and then upload it into Dryad for review by the Dryad curators before it is actually published.
The information that is needed is provided on this first "describe your data set" screen, where you can first link it to either a manuscript that's in progress or an article that has already been published. So if you have data that are associated with existing publications, you can absolutely add those to the system as well. Or, if it is being submitted for some other reason, where it isn't necessarily associated with a publication now, but you still want or need to make it publicly accessible and obtain a DOI, a digital object identifier, and a stable citation for it, you can still upload your data. It's not required that it be associated with a publication. You then need to provide some basic information about the data set: giving it a descriptive title, and providing information about the authorship, who contributed to the development of that data set, preferably again with associated ORCIDs. Dryad requires an ORCID for the first author; ORCIDs are optional for the additional authors, but as I was saying earlier, there are significant benefits to establishing and maintaining your ORCID and using that identifier wherever you can, so that you continue to build a record that's easy to track and find. After the authorship, there's the addition of a descriptive abstract for your data set. Just as with a journal article, this is the first thing that folks who are potentially considering reuse of your data will look at for information about the data set or the collection of data you're uploading. You then have some additional descriptive fields for the data set, including a set of keywords, and we can work with you on identifying an appropriate source for those keywords, because we would prefer to draw them from an existing collection of terms appropriate for your discipline, terms that would likely be used by other researchers trying to find your data set. There's a narrative description of the methods that were used to generate the data set: how were the data collected, how were they processed, that sort of information. There's an additional field for usage notes; this relates more to the effective use of the data. What is the structure? What are the formats? What is the data dictionary? Essentially, if you're working with a tabular data set, what are the column names, what values do they contain, and what are the units? This is critical information for transitioning from your data being findable to actually being reusable. John will talk in just a minute about FAIR, findable, accessible, interoperable, and reusable, as the objectives we are aiming for, and it's this documentation that is critical for hitting all of those points. There's also a critical element here in adding information related to funding. We want to make sure, coming back to the Dear Colleague Letter that Bill was just discussing, that the project gets credit for the data sets that have been created and shared. By linking information about our NSF award, that it was from NSF, and the award number, we streamline the process of capturing and reporting any of the data sets that are being produced by the project.
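Just to pull those pieces together, here is a rough sketch of the descriptive metadata for a single submission, laid out as a simple structured record. The field names and the example values here are purely illustrative, they are not Dryad's exact schema, but this is the information the submission form asks for:

```python
# Illustrative sketch of the descriptive metadata that Dryad's submission
# form collects. Field names and values are examples, not Dryad's schema.
dataset_metadata = {
    "title": "Soil moisture observations, example field site, 2019-2020",
    "authors": [
        {
            "name": "Jane Researcher",            # hypothetical author
            "email": "jresearcher@example.edu",
            "affiliation": "New Mexico State University",
            "orcid": "0000-0002-1825-0097",       # ORCID required for first author
        },
    ],
    "abstract": "One paragraph on what the data are and why they were collected.",
    "keywords": ["soil moisture", "drylands"],    # prefer disciplinary vocabulary terms
    "methods": "How the data were collected and processed.",
    "usage_notes": "File formats, structure, units, and the data dictionary.",
    "funders": [
        {"organization": "National Science Foundation",
         "award_number": "OIA-XXXXXXX"},          # the project's NSF award number
    ],
    "related_works": [
        {"relationship": "isCitedBy", "identifier": "https://doi.org/10.xxxx/example"},
    ],
}
```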
If we're adding data sets to other repositories, we would do the same thing: making sure that we capture information about those data sets for our own records, but also adding to the data sets in those other repositories information about the granting organization, as that is one way the statistics for compliance with the data sharing requirements get developed. Then there is also this section for capturing information about related works. These may be citation relationships with other data sets that fed into and contributed to the generation of this data set; there may be additional publications; there may be a data publication in a journal related to this data set as well. It's in this relationships section that you can add those related items, and you can see there are many different ways to characterize what each relationship is. Once you've provided all of this information, you can then go through the process, which I won't walk through here, of uploading one or more data files associated with the submission, and then ultimately you submit the data set for review by the curators at Dryad. This is hopefully following work with us, John and myself, to make sure that your data are already well documented, that the documentation you've produced is complete and well aligned with your data, and that the data are well structured and organized, so that we again maximize the potential impact of your data. As soon as you have submitted it for that review process, you get a preliminary, or provisional, DOI; so once it has gone through review and is published, you already know what the identifier for that data set will be. You can start referring to it in, say, a draft publication that you're submitting for review, as an increasing number of publishers are requiring that you demonstrate that the data are in a repository somewhere with an associated identifier like a DOI. There's also a link, once you have initiated that submission process, that you can share with reviewers or publishers, so that there can be preliminary access to the data set while a publication is still going through review. So there is essentially an early access opportunity for data sets, if you need that, before they even go public. But this is basically the simplified workflow that we have through the website. John and I are also working on ways to streamline bulk processing and bulk uploads into the system for large collections of data, or multiple data sets that are part of a larger research activity. And this is where, again, I want to come back to this: contact us early in your research work, so that we can help develop a strategy for crossing the finish line with data sets going into the repository as smoothly and easily as possible, in terms of having the documentation at hand, having it well structured and organized, and keeping the data well organized, understandable, and reusable as you go through the research process, so that this last stage of getting it into our preservation and sharing system is a very smooth and easy process.
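For those of you with larger collections, Dryad also publishes a REST API that supports scripted deposits, and that's one of the mechanisms we're looking at for bulk uploads. Just as a conceptual sketch, and please treat the exact endpoints and payload shape as things to verify against Dryad's current API documentation rather than as gospel, a scripted deposit looks roughly like this, with the token and file names as placeholders:

```python
# Conceptual sketch of a scripted deposit through Dryad's REST API.
# Verify endpoints and payload shape against Dryad's current API docs;
# the token and file names below are placeholders.
from urllib.parse import quote

import requests

BASE = "https://datadryad.org/api/v2"
TOKEN = "your-oauth2-token"  # issued for your Dryad account/application
json_headers = {"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"}

# 1. Create the dataset record from descriptive metadata like the record
#    sketched earlier; a provisional DOI is assigned at creation.
metadata = {"title": "Example data set", "abstract": "What the data are and why."}
resp = requests.post(f"{BASE}/datasets", json=metadata, headers=json_headers)
resp.raise_for_status()
doi = resp.json()["identifier"]

# 2. Upload each data file against that DOI, then submit for curation review.
with open("observations.csv", "rb") as f:
    requests.put(
        f"{BASE}/datasets/{quote(doi, safe='')}/files/observations.csv",
        data=f,
        headers={"Authorization": f"Bearer {TOKEN}", "Content-Type": "text/csv"},
    ).raise_for_status()
```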
And I'll take any questions while I change over and make John the host so that he can continue with some additional information about Dryad. Okay, John, you're now the host.

Great, thank you. So what I want to do is show a little more about how things look after a data set has been submitted to Dryad, and use an existing published data set to demonstrate some of the data-friendly features that Dryad offers as a repository. Can everybody see my screen? Yeah. Okay, great. So Dryad is a data repository, which gives it some features that distinguish it from, for example, a document-oriented repository like our institutional repository. As part of the data set creation process, there are two ways you can import your data into Dryad. There's a desktop method, which allows you to upload up to 10 gigabytes per data set. If you have larger data sets, there is a server-to-server option that allows you to upload up to 300 gigabytes. So there's a lot of flexibility, and we can definitely assist researchers in setting up the server-to-server communication that's necessary to upload those larger data sets into Dryad. A couple of things here: there are the ORCIDs that Karl was talking about. Again, that creates an unambiguous link between the researchers and the data. We do have some colleagues on this data set we published who do not have ORCIDs or didn't provide them; as Karl noted, it's really only required for the first author. A couple of things I want to show: this data set is fairly large-ish, about six gigabytes, and using the upload feature from the desktop, it only took a few minutes, and I was uploading, I think, 10 to 12 files at a time. So the system has a really robust backend for getting the data into Dryad; it's fast, and it's pretty painless. There's version support: we had two versions of this data set, one in November and an update that was released in January. Being able to update the data files and add new data files let us create a second version of this data set without changing the DOI. That's a great feature, because if you've shared that DOI with your colleagues or with publishers, you can make changes to the data set and the DOI itself doesn't change, so you don't have to go back and reshare it. There's also this data publication that gets created, which is a best practice on the data curation side of things. Where it's valuable is, first of all, that PDF is something that can be added to dossiers and so forth as you're moving through promotion and tenure. But also, because of the way Google Scholar and the Google search index work, a lot of the indexing is done on PDFs rather than on page-level metadata, so having this document, which includes all of the metadata and links to the data set, actually makes your data more discoverable. That is a nice feature that Dryad offers. And again, there are metrics: we can see right away how often the data set has been accessed and downloaded. To come back to something Karl was saying, we do have this descriptive metadata that's required when you create a data submission; but in addition, coming back to this idea of FAIR data, making data findable, accessible, interoperable, and reusable, it's most helpful for your colleagues and other professionals in the field to have access to this information about the methods, how the data were collected, any limitations on the data, and the data processing steps that you went through to get the data from the raw state into the published state. We have some definitions of terms here, and how we created CSV files from the original data. And as Karl mentioned, a data dictionary for the tabular data in the published data set is very helpful and important for the people who will use it.
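To make that concrete, a data dictionary can be as simple as a second CSV file that describes every column in your data file: name, meaning, units, and type. The column names in this sketch are invented for illustration:

```python
# A minimal data dictionary for a tabular data set: one row per column in
# the data file. The column names here are invented for illustration.
import csv

data_dictionary = [
    {"column": "site_id",   "description": "Monitoring site identifier", "units": "",         "type": "string"},
    {"column": "timestamp", "description": "Observation time, UTC",      "units": "ISO 8601", "type": "datetime"},
    {"column": "vwc",       "description": "Volumetric water content",   "units": "m^3/m^3",  "type": "float"},
]

with open("data_dictionary.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["column", "description", "units", "type"])
    writer.writeheader()
    writer.writerows(data_dictionary)
```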
You'll see all of this again in the metadata template that I'm going to share in a minute. For the data set I just showed, we basically took our existing documentation and copied and pasted it into the Dryad data submission. And so, with that, moving on to the template. If you contact us at rds@unm.edu, what we can do to help get your data into Dryad is set up a shared folder on OneDrive, and that shared folder will have a spreadsheet that follows this metadata template we're looking at here. It includes all of the required fields and the optional fields that Karl described in the Dryad data set submission process. So, for example, the title is required; for author one, all of that information is required, and you'll see that an ORCID is also required for that person. For the additional authors, author two, author three, and so on, the name, email, and institution are required if there is a second author, and again if there is a third author; so name, email, and institution are required for every author. If you use this template, you'll see it looks just like this in the spreadsheet version, and you can simply add additional author information as needed by adding the corresponding rows. Next, a space is provided for keywords; again, we recommend using controlled vocabulary terms, and that's something Karl and I can assist with. For methods and usage notes, please do feel free to upload relevant documents as appropriate. Like I said, in the example I just showed, we had all that project documentation, and when we went to publish the data it was just a matter of copying and pasting it. So if you have lab notebooks, if you have SOPs, any information that's relevant to how the data are collected and processed, share that with us and we can help you get it into the data set record in an appropriate, useful fashion. For funding information, again, if you have additional funding it's recommended that you include that information, but by default we have added the current EPSCOR grant number to the template. And we have a space for the additional information about related works and location information that Karl described.
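In spreadsheet form, the template is laid out one field per row, with a flag for whether the field is required; roughly like this, though the wording in the shared copy may differ slightly:

```python
# Rough layout of the shared metadata template: one field per row, with a
# required/optional flag. Wording is approximate; the shared copy governs.
template_rows = [
    ("Title",                             "required"),
    ("Author 1: name",                    "required"),
    ("Author 1: email",                   "required"),
    ("Author 1: institution",             "required"),
    ("Author 1: ORCID",                   "required (first author only)"),
    ("Author 2+: name/email/institution", "required if that author exists"),
    ("Keywords",                          "recommended: controlled vocabulary terms"),
    ("Methods",                           "recommended"),
    ("Usage notes",                       "recommended"),
    ("Funding",                           "EPSCOR award pre-filled; add others"),
    ("Related works",                     "optional"),
    ("Location information",              "optional"),
]
for field, status in template_rows:
    print(f"{field:38s} {status}")
```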
So do please feel free to reach out to us at rds@unm.edu, we'll share that email in the chat, to help get this process going, and I can also at this time take any questions while I hand it off to Diana. I already pasted the RDS email into the chat, towards the top, so it's probably scrolled off a little bit by now. Are there any additional questions? Excellent. Diana, you're now the host.

Thank you. Did everyone get their questions answered in the chat? All right, hearing crickets, I'm going to assume yes. Good morning, everybody. Let me get back to the page that I actually need to be on. I am one of the information and communication technologies directors down at New Mexico State University, and I'm also one of the main CI contacts for the entire EPSCOR project. Today I want to spend just a few minutes talking to you about some of the computational resources that the EPSCOR project helps support. When the authors of the grant sat down, they realized that both computational resources and storage resources would be necessary. So they did an incredible job and got you lots and lots of hardware to play with. Just so you have a sense of it, the storage you have access to is over 700 terabytes of open space right now. That is oodles and oodles of storage space. And we know that you are producing data, right? We see lots of publications, so there's got to be data happening. The storage system is backed up, which makes it a very convenient place for your data, unlike that hard drive stuck underneath your desk in the office that you're not allowed to go back to for who knows how long. This storage is easily accessible to you, and again, it's backed up; if you get back to the office, try to start up that hard drive, and it doesn't start, the data on it may or may not be recoverable. The HPC-attached storage, in addition to being backed up, is, and I just said some very important words here, HPC attached, which means that if your data is large and you're planning on doing any sort of data analysis, it's already in place and ready for you to use with the HPC. You don't have to wait for your data to transfer, and you don't have to worry about how long things are going to take or whether there's enough space; there is space, and it's ready and waiting for you. So if you do have analysis you want to do, what's available, and how do you use it? Right now I'm on the hpc.nmsu.edu web page; it's an incredibly easy URL to remember. Karl did a wonderful job of showing you how to locate it on the EPSCOR website if you happen to forget, but simply Googling HPC and NMSU will also take you to this page. If you go under Discovery and then Discovery Details, it has information about the shape and size of our cluster. Scroll down a little bit and you've got these three lines here that all describe the EPSCOR hardware. We have four dual-GPU nodes, meaning there are two NVIDIA Tesla V100 GPUs on each of those nodes, so that's a total of eight. Those GPU nodes can also simply be used as CPU nodes: you don't have to be able to program a GPU, and your code doesn't have to utilize GPUs, in order to run on this equipment. There are also two high-memory nodes, each with three terabytes of memory. So if you have a large data set, for example, that needs to be in memory in order to be processed, you can use these nodes for that. They also work well if you have something that doesn't span well across nodes but needs a lot of memory.
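Just to make that concrete, once you're onboarded, asking for one of those GPUs, or for a big slice of memory on the high-memory nodes, is only a few lines in a job script. The partition name and resource requests below are placeholders, the onboarding material has the real values for Discovery, but the shape is right:

```python
#!/usr/bin/env python3
#SBATCH --job-name=epscor-demo
#SBATCH --partition=epscor     # placeholder; use the partition name from onboarding
#SBATCH --gres=gpu:1           # request one of the Tesla V100s (omit for CPU-only work)
#SBATCH --time=02:00:00
#SBATCH --mem=16G              # the high-memory nodes can go far larger than this
# Submit with:  sbatch job.py
# Slurm reads the #SBATCH comment lines above regardless of the interpreter,
# so the body of the job can be ordinary Python.

import socket

print(f"Running on {socket.gethostname()}")
# ... your analysis goes here ...
```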
Okay. On this website we also have a little EPSCOR tab. If we click on that, we've tried to make it relatively easy for people to get access to the system: this web page has a description of the resources available, again, and then it tells you how to get an account. You can either click on this link right here, or you can go under Requests and then Account Request; both will take you to the same location. For simplicity's sake, we're going to go ahead and walk through this so that you know what to expect if and when you decide you need to use the HPC. The first thing, of course: name, first and last, university email, and department. If you have a non-NMSU email, please put it in, as long as it's a university email. We really don't want to be communicating with Yahoo accounts or Gmail accounts, but an NMT, UNM, or NMSU address, all of those are perfectly valid. Now, there's something about the sizing of my screen, so I'm going to apologize: these buttons seem to be sitting on top of the words. It shouldn't look this way when you view it yourself; there's something about relaying it through Zoom that's causing it to shift a little. Next, we want to know your affiliation with NMSU. If you are NMSU student, faculty, or staff, then notify us that you are an EPSCOR person so that you get access to the EPSCOR partition. If you are not an NMSU affiliate, or if you are an EPSCOR person, go ahead and click on this button, and you'll notice that this line pops out. The HPC is behind a firewall on the NMSU campus, so in order to access it from off campus, you have to go through the VPN. This is a standard and easy thing for anyone who has an NMSU email, but if you don't, we need you to fill out this HPC VPN form so that we can get you access to the HPC. We're going to take a little offshoot here and look at what this VPN form looks like. The requester, say I'm at UNM, fills out section one. It's relatively standard: name, employer, email address; and for the reason for temporary access, simply state something like "I'm a part of EPSCOR" or "EPSCOR analysis", anything with the word EPSCOR in it is fine. We don't need a dissertation on exactly what it is you're going to be using the system for. You're going to see this language again in a little bit when we finish filling out the account request, but we have a data use agreement, and this is incredibly important, because the NMSU supercomputing cluster is not a secure system. Any data that is sensitive in any way, shape, or form should not be found on the system. However, we do not go into your directories: we don't look at your home directory or your scratch directory, and we don't look in your files to try to figure out whether they contain anything sensitive. It's on you to make sure that your files obey this particular language. So if something contains sensitive data, please reach out to us; we'll try to find you a different system to run on that is secure. But this language means that you are the one responsible for ensuring that your files do not contain any sensitive information. Section 2.1 is the non-NMSU sponsor information. As part of EPSCOR, I have talked with Ann Yackel, and she is going to be the person who fills out this particular section. So once you've filled out the first part and signed and dated the user agreement, simply send the form off to Ann. She will verify that you are somebody who's listed as an EPSCOR affiliate, then fill out this section and send it on to me. Because we are all EPSCOR, I am the NMSU sponsor, and I will fill out this last section. Once it gets over to me, it should go incredibly fast; the onus really is on you to get it to Ann as quickly as possible so she can turn around and send it to me. So, we've filled out our HPC VPN form, I've gone ahead and sent it off to Ann, and now I'm back to filling out my account request. Again, under research description, we don't need a dissertation; simply putting "EPSCOR research" works.
Next, we want to ask you a couple of questions. Have you ever used Linux or Unix, yes or no? If I click yes, the next question that pops up is: have you ever used a high performance compute cluster? If I again answer yes, it asks me a little bit more about the different types of information that I'm going to need in order to use the cluster. Answering yes or no doesn't matter; it's not a test. This information won't be used to determine whether or not you will gain access; you will gain access. These questions are simply there so that we know where your base level of knowledge is and what we need to make sure to convey to you before you become a user on the system. Before anyone is able to actually log in to the system, we request that you either take the Canvas course or have a one-on-one, in this case a Zoom, meeting with one of the HPC team members. From there, we're going to teach you everything that you need to know. So if you're not even sure what Linux or Unix are, we can help: we will start by telling you what those words mean and how to navigate in that kind of system, and then build on those skills to teach you how to actually use the system itself, how to submit jobs, how to get software loaded into your environment, and how to request additional software. We're here to help. Regardless of whether you know what any of these words mean, by the time you're done with onboarding, you will, and hopefully you'll feel comfortable with it. At the same time, we are not going anywhere: the graduate students are still very active in helping the NMSU community, and the HPC admins and I are part of the HPC team as well; we have no plans on going anywhere. So once you're done with onboarding, if you realize that you've forgotten something, if you need help writing a script, if you have a script that runs great on your laptop or desktop but suddenly doesn't seem to run on the HPC, if you want to optimize something, if you want to try to convert your regular processing or analysis script into something that can run on a GPU, we are here for all of that. We have multiple ways that you can contact us, which I'll go over in a little bit, but if you have any questions, there is no question too big or too small for us to answer. We do it all day, and we rather enjoy it. Continuing to scroll down, we have the data use agreement. This should be very familiar; we just agreed to this on the VPN form. Again, it says I will not store or use sensitive regulated data on the system. Then, something that everybody should probably do once a week at this point: verify that you are not a robot. I don't know about you, but I'm not sleeping so great and the days are starting to blur, so sometimes it's nice to have this little checkbox tell me that I am not a robot. Once that's done, you should be contacted within 24 hours by somebody on the HPC team to remind you of your next steps. If you are somebody outside of NMSU, we'll remind you to get that HPC VPN form filled out and sent off to Ann, and we'll contact you to see if you want to schedule an in-person, as in-person as we get nowadays, meeting with an HPC team member for onboarding, or whether you'd like to be added to the Canvas course. If you need us, you can go under Help and then Contact Us, and we have a form available there for you to fill out. And if you go under Help and then Office Hours,
we also have the HPC team email listed right there. If you email hpc-team@nmsu.edu, it will reach four graduate students, two HPC admins, and myself, so somebody is generally available to help within a short time frame. One thing that I sort of skimmed over entirely here is why you should be interested in this, especially with COVID-19 keeping everybody out of offices and out of labs. Most people are at home with systems that are, if not underpowered, at least less powerful than what they had while they were on campus. The HPC is a very, very powerful resource. So if you're finding yourself sitting at home and the analysis that used to take an hour in the lab is now taking five hours on your laptop, come talk to us; we can help. If you're having difficulty loading all of the data that you need to analyze onto your laptop, your hard drive is getting full, contact us; we can help. It's really just a giant computer that you log in to in order to run your data analysis. So even if all you've ever done is run things on your laptop, and you have no idea how you could possibly take advantage of the HPC, contact us. We'd love to help get you onto the system and using it well. Another thing we are currently working on: UNM and NMSU are building a shared storage space where files dropped within a particular directory will be mirrored between UNM and NMSU. So if you happen to be a UNM researcher, and those systems are more familiar to you, you're more comfortable using them, or there's time available on them, feel free to move your data into that directory; it will be mirrored, and you can easily bounce back and forth between the UNM CARC resources and the NMSU resources. The entire goal of this side of the project is to make your lives easier and to support you in doing the analysis that you need. So the over 700 terabytes of storage that we have, that's a great place to keep your data: keep your data files, keep all of your metadata, keep your scripts, keep everything. Hopefully you've already talked to Karl and John about how to get those things into the repository and what information you need to be tracking for the repository, but even if you wait until the end to talk to them, this is a great place to keep that data. It will be up, it will be running, it will be supported, and it will be easily accessible by everyone on the team. That being said, everyone does have their own independent space, which means I can't necessarily see what Karl is doing on the HPC; however, we have project space where he and I can collaborate. So there are a million different ways of using the system. If you are at all interested, please contact us and let us know. And again, we're here to help.
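And in case the mechanics of moving your data over sound mysterious, it's usually a one-line copy from your machine to your home, scratch, or project space on the cluster. Here's a sketch; the hostname and paths are placeholders, and onboarding will give you the real ones:

```python
# Sketch of pushing local data to HPC-attached storage. The hostname and
# paths below are placeholders; onboarding provides the real ones.
import subprocess

local_dir = "results/2020-field-season/"   # hypothetical local data directory
remote = "username@discovery.nmsu.edu:/scratch/username/field-season/"  # placeholder

# rsync only transfers what has changed, which matters for multi-gigabyte sets.
subprocess.run(["rsync", "-av", local_dir, remote], check=True)
```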
With that, I'm going to stop sharing and see if anyone has any questions.

Good morning. This is Caitano da Silva from New Mexico Tech. I have a very quick question, which is: are these resources available for researchers that are not connected to this specific New Mexico EPSCOR grant? For example, if I have a student funded on a separate grant, can he or she still use the HPC resources at NMSU?

So that's a great question. Usually there has to be an agreement with a researcher on the NMSU campus in order for your student to have access to those resources. If that isn't the case, we also have regional resources and national resources that are free to use. If you don't mind, I'm going to put my email address here into the chat; send me an email. If you don't have an NMSU researcher that we can put on as a collaborator, I would be more than happy to give you access to my XSEDE allocation, which is the national resource; it's a collection of, I think, six incredibly large HPCs. You can see which one of those would be best for you and your student to use, and I'm happy to help you figure out which one, because they have various amounts of software and different levels of support. Also, the regional supercomputer is called Summit, not to be confused with the Top500 supercomputer of the same name; it's located on the University of Colorado's campus, and we have access to that as well. I and at least one of the HPC admins are XSEDE Campus Champions, so we can help you get access to the XSEDE resources, and Patrick and I are part of RMACC, the Rocky Mountain Advanced Computing Consortium, so we can help you get access to the regional resources as well. They're available; you just have to reach out.

Okay, thank you.

And as I've said in chat, the same policy applies for the UNM resources. If you have a UNM researcher you're working with on the project, we can get you access to the supercomputing systems at UNM and the associated storage, which is mirrored, as Diana said, with New Mexico State. And likewise, we can also help you with national resources too. On a different topic, I would extend the comments I was making in the chat regarding access to the Dryad system, and our strong recommendation that even if you can or are directly submitting content to either Dryad or another repository, please get in touch with us at rds@unm.edu, because we're also trying to keep track of the various data resources being created by the project. We are also working on making sure that those data get into our preservation system, something we didn't highlight earlier, which is essentially a dark archive managed specifically to keep the data safe for the long term; it is designed to make sure the data remain viable and intact, and that over the longer run, if necessary, formats can be migrated to supported formats. So please reach out to us, even if you have already placed, or are directly placing, your data into another repository, just so we can make sure we're also getting it preserved through that other part of the sharing and preservation system we have for the project. Any other questions or comments from folks?

Okay, Caitano da Silva from New Mexico Tech here again. I have a couple of questions for you, Bill, mostly about the logistics here. The first one is: Dryad is the recommended solution, but not the required one? Could I use something else? And the second question is: if we are supposed to also upload papers, journal articles, to these types of databases, how do we deal with the publishers' copyright issues and those types of restrictions?

Well, with respect to linking papers to the NSF PAR system, I don't think there's any publisher that prevents that from happening. This is a legal requirement, and I think all of the publishers have agreed to do that with NSF. Those publications, though, I don't believe are publicly shared right away.
I think they're under an embargo period before they're made publicly available, something like a one-year embargo. So that should not be a problem at all. And yes, you can use another repository, but it can't be, like, your departmental server, because that's not a community-acceptable repository. It would have to be something like what you'd find if you go to re3data.org and search through the community repositories there. Dryad is nice because it's relatively easy to use, but in some cases there are specific repositories for certain types of information, like the Protein Data Bank; that's a publicly acceptable repository for protein structure data, and there are others that clearly might be more applicable for some types of engineering data. Those are the types of repositories we're talking about; an in-laboratory server would not be considered a public repository. Does that answer your question, Caitano?

Absolutely, thank you.

So, I would encourage everyone: if you've published a paper that uses data anywhere, or has graphics in it that are dependent upon data, go ahead and make the connections with Karl and John and get those data sets into a repository, if not Dryad, ASAP. If you're working on a paper right now and you've got data, it's reasonable to go ahead and make those contacts as well, just to let John and Karl know that data are coming their way very soon. We will soon be reaching out to everyone who has published on EPSCOR and asking about the status of the data that underlie those various publications. So if you don't want us pestering you, go ahead and make those contacts and get the data going into the system. This is a federal requirement, and officers from the Office of Inspector General at NSF can come and close down a research program. So, for example, if someone has a Freedom of Information request and wants to see the data underlying a publication that was supported by NSF funds, and those data cannot be produced via mechanisms like we're talking about here, then NSF can come in and essentially shut down research for that university. They've not done it yet, but my guess is they will certainly be looking for some scapegoats down the road. So let's go ahead and be on the positive side of the regs here and get those data in. Again, not all the data necessary to reproduce every single aspect of your research; but the data should be discoverable, and they should potentially be reusable, and so on. It's not meant to be onerous, and most of the metadata required is really what's in the methods and materials section of your papers anyway, so it should be relatively straightforward to do this. And also, take advantage of the computing horsepower we have. We're really blessed to have lots of high performance computing resources in New Mexico that can really benefit our research, and we've got some great resources both through New Mexico State as well as UNM. We should all be taking advantage of those, and feel free to contact any of us in the state office, or Karl or John or Diana, if you have any further questions down the road. Unless there's anything else, I thank you all for your attendance today, and have a good rest of the week and upcoming weekend.