Well, in the interests of time, because we have a number of presenters today, I'm going to go ahead with the introduction to this panel. My name is Martin Halbert. I'm the NSF science advisor for public access. I'm here today representing NSF together with my colleague Plato Smith. You want to wave your hand back there, Plato? He's another program officer at NSF, and together we're sort of the central core of the NSF public access initiative.

What we thought we would do with this session is illustrate concretely what the program has led to. The NSF program is the sum of all the programmatic agency responses to the national need for publicly funded research outputs to be made publicly accessible. NSF's public access initiative does a lot of things. We run the agency public access repository, the NSF PAR, and we do a variety of other things, but today we wanted to feature awardees from the program over the years, some representative and catalytic projects that have been funded.

So the public access program supports projects that seek to make the outputs of publicly funded research available to the public, as I said. The program has funded 129 projects over the past decade; I looked it up this morning. Is Beth Plale in the room anywhere? Hey, Beth. Beth is another former program officer in the public access program at NSF, and at least one of these, I think, Bill, you were one of Beth's awardees, maybe, somewhere in the past. I don't know. The program has a long and storied history.
So that's wonderful. The program makes a lot of awards in any given year. This year, in '23, we made 10 catalytic awards. This was a particular focus, to try to catalyze change through key change-agent projects. I'm not going to go through this list of 10, but we have one of them represented here, the SMART DMSP project, that you'll hear about.

We wanted in this panel to give you a perspective on the projects that have been awarded over the years, so we have a mix: projects like the PASS program that you're going to hear from, which are farther back in the program's history, as well as two representative projects from the large program that we competed last year, in 2022, that you may have heard about, the FAIROS RCN program. It's a long acronym mouthful: the Findable, Accessible, Interoperable, Reusable Open Science Research Coordination Networks, a particular genre of awards that NSF makes. You will hear about what kinds of activities NSF RCNs do, with the specific focus on FAIR data principles and open science more broadly. So with that, we're going to tee it up first with a presentation by one of our FAIROS projects, so we'll have Kathleen up first.

Thank you so much, Martin, and thank you for including us in this panel. I'm super excited to be able to be here today representing a broadly cross-disciplinary team working on a project called STEM Ed+ Commons, which we refer to as a FAIR/CARE OS RCN, which I will talk more about as we go. We were one of the grantees in that first round of FAIROS RCN funding, and we're really seeking to build a network of STEM education researchers, as well as a platform to support their work and to help them facilitate more and better open collaboration among them.

So our proposal originally referred to the project as DBER+ Commons, under erasure here. If you go looking for us in the NSF database, you're going to need that term in order to find the grant, but we rapidly figured out, as we
started the community-oriented work of pulling this RCN together, that DBER, or discipline-based education research, wasn't a term with which a lot of the folks that we were trying to bring together really identified. And it didn't adequately address the core goal of the project, which is really to get post-secondary education research in STEM fields out of their disciplinary silos. Right now crucial research is being done within physics education or bioscience education or math education and so on, but it isn't really known outside of those communities. And so their researchers end up reinventing the wheel on a lot of their projects. Research communities are unable to learn from one another, to build on one another's results, and this ends up resulting in a lot of duplication of effort.

So one key goal for our project, then, is to move from these many isolated STEM education research fields to one complex STEM Ed+ community that can really result in better interdisciplinary research coordination. Now, one challenge in creating this coordination is that these fields operate with widely varying and often poorly applied metadata standards, and standards for things like data access and reporting principles and so on. As a result, bringing these fields together is not just a matter of building a platform or gathering the people who make up the research coordination network, but rather requires developing collective agreements and, even more importantly, researcher buy-in that will help standardize the dissemination and preservation of their work.

Moreover, because much of the data that's generated by STEM education research is qualitative rather than quantitative in nature, and because the researchers who produce those data often feel quite proprietary about them, there hasn't to this point been a strong commitment to public access to that particular product of research. The existing platforms that disseminate the products of research within STEM education
fields tend to focus on publications. But education researchers produce a wide range of other kinds of research outputs, including data, of course, but also including project materials that would benefit from circulation and that our RCN needs to be able to support.

Even more, living up to the FAIROS label for our RCN is going to require significant community engagement in order, for instance, to develop the collective norms and assumptions regarding means of ensuring that the data that are shared are shared with the highest concern for the privacy and security of the communities involved in the research projects. Developing those norms will require not just a commitment to making data findable, accessible, interoperable, and reusable, but will also require a commitment to the CARE principles as well: ensuring that the data shared are used for the collective benefit of the communities that are being studied, that research subjects, including students, have the authority to control how data about them are being used, that researchers exercise appropriate responsibility in their engagements with and support for the communities with which they work, and that community ethics are brought to bear in ways that limit harm and maximize benefit.

So in considering how to build such a network, and how to establish the shared values and principles that it requires, the project leads within the STEM Ed fields turned to a seemingly unlikely partner, Humanities Commons. Now, Humanities Commons was originally launched in 2016 by a team at the Modern Language Association in order to provide a means for scholars and practitioners across humanities fields to share their work with one another, to develop new collaborations, and more. We built this network specifically for the humanities in large part because our fields were technologically underserved. While there were a range of platforms that folks on our end of campus could use to share their work, few of those researchers recognized
themselves as having a place on those platforms, and so their communities just simply didn't grow. But by creating a network of networks that provided a range of professional organizations and institutions in the humanities with spaces to cultivate interactions among their members, alongside an open-access, free-of-charge space that any interested person could join, Humanities Commons developed a strong and growing interdisciplinary community. We currently have somewhere in the vicinity of 52,000 members worldwide who are using our network to build free and ad-free WordPress websites, to develop rich professional profiles, to deposit and share work through the Commons repository, and more.

Now, over the last few years, we started noticing a growing number of Commons users who were not in the humanities at all, but were nonetheless attracted to our platform despite being social scientists or even STEM researchers, because they recognize that our combination of the repository with the social networking capabilities that the Commons provides allows them to develop their own communities on the platform, and to ensure that the work that they most want to get out to the world is able to reach it. And moreover, we're a scholar-built nonprofit platform working toward transparency in all of our technological, financial, and governance processes.
We provide a trusted space for researchers who want to work in public but are unhappy with the extractive nature of most corporate-owned, free-of-charge platforms. So the Humanities Commons team has partnered with this collective of STEM education researchers in order to develop STEM Ed+ Commons.

This is a diagram with a whole lot of very tiny text on it, which I would be happy to share if any of you are interested. This was a diagram that we used in our proposal to talk about the affordances of the network that were already in place, and then the affordances that we were going to need to develop as part of this project. The STEM Ed+ Commons is going to provide all of the features and functionality of the existing Commons, and it will support the new community's processes of developing its own standards and practices. But because those researchers coming to us from STEM fields are likely to bring with them a range of different assumptions and requirements about disseminating their work than our humanities-based members thus far have had, we're in the process of making some really significant transformations to the network stack itself.

We are shifting from our present Fedora-based repository stack to an InvenioRDM stack, which is going to allow us to ingest more and different kinds of information and data sets and other products. It's going to allow us to support the deposit and sharing of complex multi-file projects. It's going to allow those projects to be versioned.
It's going to enable integration with things like GitHub, and more. And all of this will create opportunities for new kinds of engagement and exchange for all of the Commons members, including connection to open peer review platforms and the development of new kinds of overlay publications, which we hope will be coming.

However, while the technical affordances of a research platform like ours matter enormously, building and sustaining its social infrastructure is in many ways more challenging and more important to the project's success. So this first iteration of the new Commons repository, for instance, will soft-launch in late spring. But the STEM Ed+ community engagement team is currently conducting interviews and doing focus groups and more kinds of outreach with leaders in education research, to find out more about their needs, to check in with them on the process of development that we're working through, and to develop the collective understanding that's necessary for more open data sharing, as well as for developing a set of shared best practices around discoverability and privacy and so on.

Now, as for Humanities Commons (Santee, if you're in here, this is the diagram that I was talking about at lunch), we're in the process of rebranding right now in a way that will allow us to be more clearly seen as a home for interdisciplinary work, for a wide range of organizations, institutions, and research networks that are grounded in openness, transparency, and community governance, and that want to join and support the building of this network. In addition to developing that community, we're also working on re-architecting the technical core of the network, working toward a more federated, distributed network of networks within the Commons that use the ActivityPub protocol to support cross-instance communication. So we're really grateful to the NSF and to the STEM education community for giving us this opportunity to really rethink how our network can function within a broadly
interdisciplinary and evolving research space. So thank you very much.

Our next presenter is our first presenter from Johns Hopkins, of the two unrelated projects at Johns Hopkins. And let me see, here we go.

Hello, everyone. As Martin mentioned, I'm Bill Branan at Johns Hopkins University, and I'm going to be speaking about the Public Access Submission System, or PASS, which has a purpose, really, of reducing the burden on researchers as they seek to comply with policies that are either imposed by funders or imposed by their institutions around public access and open access. The project seeks to do this by simplifying the process of depositing content. It simplifies this process by defining what a workflow should be and doing the work in the background to capture all the information that's necessary to step into that workflow: journals, grants, publications, metadata, policies, and other things as well. It pulls that together to place it in front of researchers, to allow them to make selections, input only the information that's necessary, and move on with the process as simply as possible.

So PASS provides a workflow that is a step-by-step process that gives researchers all of the information they need right when they need it, and allows for deposit into multiple repository systems, so agency repositories and institutional repositories, and seeks to track the outcomes of those in order to enable researchers to see where they are in that process and what the outcome is. And really the intention here is for the system to be extensible, to allow for a greater number of deposit types, a greater number of deposit systems, and a greater number of inputs for data, so that a broad range of institutions can come to this application and say, this is going to fit within the context of my university and provide value to the researchers that are doing great work within our context.

So the background here is that PASS started as a project, a collaboration between JHU, Harvard,
and MIT back in 2019, and we were really kick-started in large part by an NSF EAGER award at that time that allowed us to investigate the mechanisms that are required to handle deposit in a consistent and applicable way across all of these different systems and applications where we might want to push data. The result of that award was a specification called BagEnD, the BagIt-Enabled Deposit. And yes, some of the folks in the group were Tolkien fans. The purpose of that was to separate the mechanism for handling the deposit itself from the packaging for that deposit. So this set us up for having the conversation: if we need to move manuscript data from the researcher's hands into a repository system, what is the structure for that? How do we capture it and pass it, and how do we actually take those steps to go from point A to point B and ensure that it arrives in the right place at the right time?

So over the last two years we've been embarking on a significant modernization effort of the PASS application. In the last slide I mentioned that that first round ended in 2021, and then there was this thing called COVID. And then in 2022 we picked this back up and said this is something that we want to see utilized not just at Hopkins, where it was deployed at the time, but by a large number of other institutions that have the same problem. We keep hearing from others that this is a challenge across the board, so how do we set up an application so that we're in a place where others can use it, and we can move forward with this for Hopkins as well?

I'm not going to read through all of the things on this slide, but there are a few things to point out. There was a significant amount of work done in replacing the back end, in recognition that the PASS system is not intended to be a repository itself. It really is intended to be a workflow system that allows for moving content from
one place to another, with repositories on the back end that have their own expectations. So we removed anything that was unnecessary, to simplify the process, simplify the system, and make this something that is as extensible as possible, so that we can move in a direction of allowing other institutions to plug in their repositories, plug in the data sets and the data flows and the systems that they already have in place, and so that we can move this into a broadly adopted system.

One thing I will also call out here is that, as part of this process, we adopted some new open source processes. PASS has always been open source, but we are now collaborating with the Eclipse Foundation to really move into a modern context of open source, ensuring that we have solid governance and the kinds of structure that are necessary to work with a collaborative group.

This is the second of three major initiatives over the last three years for the PASS project, the first being the significant development effort and the second being moving into production with this new version of the PASS application. So we did a reboot of the application deployment at Johns Hopkins, working with a large number of folks in a working group in the libraries to build out a new website, a new user guide, a new FAQ, walk-through videos, all the things you would expect when a new application is being deployed in the university context. But I don't call this out to say look at the fantastic things we have at Hopkins. It's more that PASS is being set up to have a series of resources that are available to be used, both the software and these other resources, that allow people to see: How can I use this? Where's the value? How do I connect to this within my own institution?
So these are all resources that are expected to be utilized and repurposed for the purpose of using PASS in other institutions. If you'd like to see what that looks like, you can go to pass.jhu.edu. As one example of the resources that were created as we were working towards talking about PASS within the Hopkins context, there was this slide that allows us to walk through some of the values and benefits that PASS brings to the table for the institutional researchers that are looking to handle and submit their manuscripts through PASS into either our institutional repository or the NIH manuscript system, which leads to PubMed.

The third part of the three initiatives that occurred over the past year was working together with a series of other institutions that are like-minded in seeing PASS be utilized within their institution for the submission of this manuscript data into repository systems. So in the list here you can see Caltech, LYRASIS, NCAR, TIND, and the Universities of Louisville, Oregon, Rochester, and Virginia. And I'll call out here that we have five universities as well as a research group as well as two organizations that handle and manage hosting. We were very intentional about choosing institutions of varying sizes, and that use different kinds of systems, so that we could try to understand what the capabilities are in different places and look to fit PASS into as many of those scenarios as possible, and also working with TIND and with LYRASIS to understand what the requirements are for moving PASS into being a hosted platform.
So not just something that is expected to be run by a set of systems engineers at a large institution, but something that can be utilized in smaller institutions as well.

So looking ahead, again, we have a three-part initiative. The first of those is to really focus on federal agency deposit. We've been in touch with Martin, of course, and others at NSF and DOE, and really want to move forward towards an automated deposit capability in that context. We see a real need for that, and not just at NSF but across the spectrum of agencies, and we would like to work with a number of them. NSF is of course a major goal, because so many of the institutions represented here have funding from NSF.

We would also like to ensure that we have a consistent and modular system for capturing information about grants at every institution that we can bring into PASS, so that there is a consistent method for pulling the data and information that PASS needs to function into the system so that it can be utilized. What we discovered in our series of conversations over the last six months was that there's a huge variety, and very little consistency, in the way grant data is managed across institutions, so we are looking for something right now that is as generic as possible.

The third is working towards additional deposit repositories at institutions. We have a number of these represented by the groups, and we're very excited to move towards having a larger number of institutional repositories that can be part of the PASS system. I had a great conversation with folks at Caltech yesterday about potentially integrating with the InvenioRDM system that they are utilizing, so that seems like a fantastic next step.

So as I wrap up, I just want to say thank you to NSF for really kick-starting this project and allowing us to make all of this progress over this time. So thank you.
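
As a rough illustration of the idea raised in this talk, keeping the packaging of a deposit separate from the mechanism that delivers it to one or more repositories, here is a minimal sketch. It is not the project's specification or code. It assumes only the generic BagIt directory layout (a `bagit.txt` declaration, a `data/` payload folder, and a `manifest-sha256.txt` of checksums); the `Depositor` interface, the `LocalFolderDepositor` stand-in, and all function names are hypothetical and exist only for this example.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Protocol


def make_bag(bag_dir: Path, payload: dict[str, bytes]) -> Path:
    """Packaging step: write a minimal BagIt-style bag to disk."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    manifest_lines = []
    for name, content in sorted(payload.items()):
        (data_dir / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{digest}  data/{name}")
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    return bag_dir


def verify_bag(bag_dir: Path) -> bool:
    """Recompute each payload checksum and compare it with the manifest."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True


class Depositor(Protocol):
    """Hypothetical delivery interface; one implementation per repository."""

    def deposit(self, bag_dir: Path) -> str: ...


class LocalFolderDepositor:
    """Stand-in 'repository' that simply copies the bag into a folder."""

    def __init__(self, repo_root: Path) -> None:
        self.repo_root = repo_root

    def deposit(self, bag_dir: Path) -> str:
        target = self.repo_root / bag_dir.name
        shutil.copytree(bag_dir, target)
        return str(target)


def deposit_everywhere(
    bag_dir: Path, depositors: dict[str, Depositor]
) -> dict[str, str]:
    """Delivery step: send one package to several targets, recording outcomes."""
    return {name: d.deposit(bag_dir) for name, d in depositors.items()}
```

In this sketch the same bag could be handed to an institutional target and an agency target without repackaging, and a new repository would only need its own `Depositor` implementation. A real integration would replace `LocalFolderDepositor` with whatever transfer protocol the target repository actually supports.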
So our third presenter is John Chodacki from California Digital Library.

Thank you. Yeah, so I'm going to be talking today about machine-actionable DMPs. The project that we're working on with NSF support is called SMART DMSP. It's really an experimentation project that's thinking about piloting different approaches and streamlining the way that we're capturing metadata and pulling information from data management plans, and demonstrating the value of that to our communities. The PI for this is Maria Praetzellis, so I'm representing her today; she wasn't able to make it. She and I both work at the California Digital Library on a team called UC3, the UC Curation Center, that's focused on these issues around research data management, persistent identifiers, and digital preservation. And we have several partners that we're working with on this project. CDL is the main group working, but we have partners at ARL and at DataCite that are also subawards on this work, so it's definitely a team effort.

So what is a machine-actionable data management plan, or data management and sharing plan? In very clear terms, it is taking what we know of as a data management plan and structuring it. Thinking of this technically, you can think of it as JSON-formatted information that is common when people are doing planning on projects. The reason why people are interested in this, and why we are working on this project, is because we want to really enable all of the systems, and research-supporting systems, and people in our ecosystem to leverage the really robust and meaningful information that's captured in these documents, making sure not only that they're available to machines and people in real time, but also that they become what we've always wanted DMPs to be, which is living documents that get versioned and changed over time.
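
To make the description above a little more concrete, here is a minimal sketch of a plan held as structured, JSON-serializable data and revised as a "living document" with a version history. The field names (`title`, `version`, `created`, `modified`, `datasets`) and the helper functions are illustrative assumptions for this example only; they are not taken from any published DMP schema or from the project's own data model.

```python
import copy
import json
from datetime import datetime, timezone


def new_plan(title: str, datasets: list[dict]) -> dict:
    """Create version 1 of a structured plan (field names are illustrative)."""
    now = datetime.now(timezone.utc).isoformat()
    return {
        "title": title,
        "version": 1,
        "created": now,
        "modified": now,
        "datasets": datasets,
    }


def revise_plan(plan: dict, **changes) -> dict:
    """Return a new version with the changes applied; the old one is untouched."""
    revised = copy.deepcopy(plan)
    revised.update(changes)
    revised["version"] = plan["version"] + 1
    revised["modified"] = datetime.now(timezone.utc).isoformat()
    return revised


# Keep every version so the plan's history stays inspectable.
history = [
    new_plan(
        "Example project plan",
        [{"title": "Interview transcripts", "type": "text"}],
    )
]
history.append(
    revise_plan(
        history[-1],
        datasets=history[-1]["datasets"]
        + [{"title": "Survey responses", "type": "tabular"}],
    )
)

# Any system that reads JSON can consume the latest version of the plan.
latest_json = json.dumps(history[-1], indent=2)
print(latest_json)
```

The design choice worth noting is that a revision produces a new record rather than overwriting the old one, so other systems could compare versions or be notified of changes. How versions are actually stored and identified in a production tool is outside what this sketch assumes.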
And so the project that we're working on is really thinking about how we can get there within systems and tools.

So why are we talking about this now? Obviously, I'm sure everybody here knows there's a lot of policy discussion happening right now around what is changing, not only in federal agencies, but then the trickle-down effects of those changes into our institutions and into research. This journey that we're on, of seeing where the current policy landscape is going to take us, is definitely something that has influenced the work that we have here in this grant project, but also the work that we're doing in general with machine-actionable DMPs. And again, we as institutions are regularly looking at how we can track our outputs, not only for our own understanding, looking at our own investments and our own impact within our communities; tracking is also important for us to be able to showcase compliance in our conversations with funders like federal agencies.

So with all of this in mind, the project that we are working on is very timely. It is also because we want to facilitate change. We want to make it so that information about research projects is more easily retrievable. We want to create systems that are based on primary sources, the researcher's plan, to understand, to be notified of, and to verify what's going on in research.
Many of us will know that if we told our non-institution-based friends that we at institutions don't really know what's going on in research labs across our campuses, they'd be amazed, because they assume we know this information. And we should know it, and we do have access to many documents that would inform it, and it's time for us to be able to retrieve that and build that into our systems. So this is what we're trying to do: facilitate that change.

So, getting into the nitty-gritty of what we are building towards. At California Digital Library, we run a project called DMPTool that many of you use and leverage on your campuses. That is just one of many data management planning tools that are out there in the world. Here's a series of logos that we work with. It is a community of people who have built systems in this space. One of the challenges that we started tackling about five years ago is, okay, as we're moving towards this machine-actionable DMP world, what is that JSON format? What is that structure? What is that markup going to be? And we launched an RDA working group on what we call the common standard for data management plans, and came out with a published standard for people to use when they're interchanging information that one would find in a data management plan.

And so maybe one of the key takeaways is this. In the conversations that we have in this space, we regularly talk about how there are many people working on different things, and wonder if they're competing or cooperating, and maybe one takeaway is that all of these logos cooperate. We talk all the time. The common standard is something we co-developed, and all of our systems are being retooled right now to leverage the same standard. So it's important for you, as part of the community, to understand that we are actually on the same page.
This is not a competition.

But what is in that markup? What is it that we're tracking? It is the information that you would assume it is. What are the data types that are coming out of research? What are the tools and software that are being used? What are the standards for those systems and tools and outputs that researchers are creating? What is their preservation and access plan? And then what is the oversight over time? I think these are the main components that you'll find if you go into the markup and the structures of this exchange format that we've created.

And really the idea is that with that exchange format you can start to build integrations to facilitate guidance. Many of you use DMPTool now as a way to create a discussion with the researchers on your campus. That same kind of guidance can be optimized and streamlined through integrations with machine-actionable DMPs, to support, as I said before, compliance, but also to promote research integrity and really be able to understand and see what is being created, so we can track the impact of research.

So what integrations are we working on with this project?
We have taken the DMPTool and restructured the back-end system to use that common standard as the basic structure for all DMPs that are created within DMPTool. We're using that standard as the database fields that we use to store, leverage, and create the systems and structures for data management plans right now. So what we're doing with this project is taking that structure and looking to integrate it into key systems in a campus setting and in a research setting. So we're looking at structures for RIMS systems, looking at the best way to get into electronic lab notebooks, and thinking about who the main stakeholders are, about computational environments and these types of things. We're also looking at ways that grants databases are leveraging and managing systems and information, so that things can be up to date and, as I was talking about before, notifications and verification can take place.

We have started prototyping work right now. If you are a DMPTool member or user, you can go and see that we've started prototyping what landing pages would look like. What we're really trying to do here is have the information that is in a data management plan be as public as possible. So we have a lot of controls on privacy, but we want to make sure there are ways for people to actually use persistent identifiers to resolve to a landing page to see what the outputs from projects are. And so this is an example of a grant project that had a DMP. It has a landing page. It has a place for people to go, and outputs that are tracked, that are found, that have been published, whether journal articles or data sets or protocols or software, are listed at the bottom of this kind of page. You can see this kind of prototyping happening right now if you're in the DMPTool.

And also, obviously, integrations require APIs,
so we are spending a lot of time and energy with this project to integrate with systems through our API. We actually have our API well documented. If you can think of systems that would be good integration partners, if you have ideas, or if you or somebody on your team would be interested in looking at the way that we're integrating on the technical side of things, please feel free to jump over to our API documentation.

And really what we're doing here is trying to think about how we can pilot these systems on campuses. We at CDL received an award from IMLS recently, and so we've been combining those resources and that planning to work across the two awards on the idea of piloting with institutions. As I mentioned before, we're partnering with DataCite on the development of how to use persistent identifiers and DMP IDs. We are working with ARL on staffing support for helping with these pilots, and really trying to make sure that we are connecting systems as much as possible. We're exploring with the community how this new data model, the common standard for data management plans, can be leveraged. And, as all of us are, we are definitely looking, as a key component of this project, to leverage machine learning on what it is that researchers are trying to convey with their plans, and at how we can make the whole process much more streamlined and understandable to them. That's it.

Our last presenter is also from Johns Hopkins: David Elbert, a materials scientist, and I'll say his MaRDA project. I'll embarrass him again by relating that it has the distinction of the single largest number of peer-reviewed papers generated in the NSF public access repository of any project in NSF history.

Thank you, Martin. That's probably not true, but we'll move on, and if it is, it's only true because of my collaborators.
So let's see if we can get this slide show to start. As Martin said, I'm David, and I'm here to represent one of the RCNs, in materials science. I want to say that an important part of this panel is this idea of catalyzing broader use of data, which is central to addressing the types of things that are important in different domains. For us in the materials world, there are really these fundamental materials problems. We work on figuring out how a material, a metal, a ceramic, a polymer, works, and then on expanding knowledge of how we might advance new designs or novel discoveries and use things like AI/ML to accelerate that. The Venn diagram here is a perception I have of the way workflows are. We talk a lot about workflows, and there are really three different kinds of workflows that a lot of people I work with don't recognize, because they're focused on their own. For us, we have lab data production, the place where data is produced in experiments or modeling. We have data analysis and visualization, and there are different workflow tools that you might use there. And then data curation and sharing of information is yet another type of workflow that we have to deal with.

So data is growing exponentially in the materials world, and our main framing is something called the Materials Genome Initiative, the MGI, which came out of the Obama White House about a dozen years ago now and is administered out of the Office of Science and Technology Policy in the White House. On the left is a graph that shows an inflection point in this exponential growth in digital use and data use around the time the MGI formed. But I want to emphasize that the growth in data use doesn't necessarily correlate with a growth in data understanding or access.
This is still mostly people working with data they're creating, mostly in their own projects. The figures on the right are the overall views from the early MGI strategic plan on top and the most recent one, what we call MGI 2.0, on the bottom. The point there is to show you that there's always been this idea that there are a lot of things that are interrelated and threads that connect. But how do we actually connect them and make those things work?

The science data landscape is changing. We're automating everything. We have high throughput in our labs. If we take automation and combine it with decisions, we have this opportunity to move towards autonomy, something that might be called self-driving labs. Although Clifford Lynch took issue with that term in his opening remarks, it's a term that's out there, it's evocative, and the community likes it. But a real key here is that linked data is absolutely important for being able to look at the entire history of what we do across the entire research landscape, not just packaging up some FAIR release of data that's associated with the figures in the publication we came out with last week. The problem we find is that scientists are the worst judges ever of the reuse of their own data. The science is a little bit myopic. We tend to work on a lot of different stuff that's very difficult. We have an increasing fire hose of data.
We have great new detectors and big facilities. We have faster data than ever, a bigger variety of data, and larger amounts of data, and we collect it in a much more distributed research enterprise than we ever did before. The same graduate student will go to the synchrotron, work on the TEM downstairs in the lab, and do something in a thin-film deposition lab. So we get a little myopic. These techniques are hard, people have to learn how to do them, and they really don't have time to see the whole forest, sometimes, for the trees.

So we have worked hard at building community efforts to try to link together data stakeholders that go beyond just the R&D people or the researchers in the labs. The figure on the right is a somewhat fanciful representation of one way to define a bunch of stakeholders: the computational facilities, the labs, the PIs, the publishers, the repositories, the libraries, all these types of people who have some relationship to materials data and whom we need to network together. MaRDA, the Materials Research Data Alliance, formed about five years ago now. We're coming up on our fourth annual meeting in February. Big plug: everyone should go. It's free.
It's online. We'll talk about it later. And then we have the new MaRCN, the Materials Research Coordination Network, which is the major effort under MaRDA to actually fund some things and make some things happen. The most important part, I think, is that the issues we have pushed are actually encompassed and incorporated in MGI 2.0, where we are goal one, objective two: to establish a national materials data network. In other words, people are infrastructure too. It's not just software and hardware. In fact, maybe people are the most important infrastructure. So I just want to emphasize that it is a grassroots, community-driven effort. We have everybody voting with their time, and they're all volunteers. We have over 350 members now, and it grows. But we also have a governance council that helps make things actually work, people who volunteer a little more time to get things going and who are liaisons to funders and policymakers, because there's quite a disconnect between people doing the work and people sitting at meetings like this, or at the RDA, or other places. And I really want to emphasize some of these people in particular. The photos on the right are the members of the governance council right now, who are all brilliant in their own fields, but they're also brilliant at putting the time into understanding what other people are doing and networking people together. The annual meeting is the third week of February, Tuesday through Thursday, half days, virtual, online. At marda-alliance.org you'll find the information you need.

So MaRCN, our research coordination network, is a subset of that council: some people who came together to try to start working towards sustainability of this organization by actually pushing more outputs out and helping to guide things a little bit. We're rather distributed in location, although, sorry, Pacific time zone, we don't have anyone there yet. We have members there, and members on our councils, but not on this particular award. We have a
focus on education in MSIs and the like, and we really work in four different areas: FAIR data; FAIR models and workflows; FAIR training, which is workforce development that is focused on MSIs at this point in time; and FAIR impact, because we think it's important that the impacts evolve as the work is done in these volunteer working groups. We don't want people to have one end game, because the target is always changing. So it has to work, and you need to include people and roll out impacts as you go, rather than wait for a final report and then shop it around looking for someone to do something.

We work with a lot of meetings of various types. There are two listed there on the right, working towards metadata in something that's high-value in the materials world: electron microscopy metadata. Almost everyone does electron microscopy of one sort or another and has laboratory management to do. So we have virtual meetings, we have in-person meetings, and we're experimenting with a number of different things: doing only virtual things, bringing people together, actually paying for travel and getting people where we need them to go to try to incentivize it.

I want to emphasize three impacts from different working groups that have come out of this. The first one is a roadmap that's come out as a perspective piece in the MRS Bulletin, the Materials Research Society Bulletin. This one, I think, is interesting because it's different from a lot of roadmaps. There are a lot of things written about data, and data in the materials world. Here, what we tried to do is simplify things, keep it small, and give people at all different levels something they can do. You know, there's a research data cycle. People think of the life cycle of research data and their places along it. There's also a life cycle for the researchers that are involved and the people who are the stakeholders. They are at their own place along that, and you can't simply say, well, here, comply, meet some checklist, become FAIR,
do all these things. You need to meet people where they are. So across the top of that roadmap we have individual things that people can do, from very simple, level one, level two, level three. They can do that in their own lab, they can participate in that in a bigger organization, and then we have what we have envisioned as community things that people can work on together, like targeted high-value data sets, software, and the like.

The second working group, which has been kind of brilliant for us, is one that came out of our annual meeting two years ago, where there was a lot of talk about metadata extraction. Most of us automate metadata extraction. Metadata entry forms are where science goes to die. Nobody wants any more of those than needed. We're producing digital data, so we might as well take the header of the file and get what we can get out of it. So we were all doing that, and we thought, why are we all doing that when we have the same instruments? So what we've done, and really it's the brilliance of Matthew Evans at Louvain and Peter Kraus at Berlin, since it quickly became an international project, is create a system that is basically a layer with an API, a schema, and a registry. This is now coded up, so with a simple YAML file you can describe what the metadata in the header of your x-ray diffraction instrument or your scanning electron microscope looks like, and so on and so forth, and we can share those things openly. This will roll out in the next month or so.

And then there's one that's been exciting to a lot of people because the numbers are impressive. Everybody's been very excited in the last year about large language models. Everybody's buzzing about ChatGPT, and especially early in the year people didn't quite know what to make of it. There have been sessions here teaching us more. So Ben Blaiszik, who's one of our co-PIs, brought some friends together to organize a hackathon, an online hackathon, and the outcomes here
were really interesting. We had a lot of people participating, because it's popular and it was fully virtual, so it was easy to do. We had submissions, and then we had 10 submissions that really went farther. I think the thing that blows people away is that we had more than 1.2 million views on social media. That's really the brilliance, in this case, of Ben Blaiszik, for saying all the submissions have to come in on social media, and you tag them with the hashtag for MaRDA and you tag them with hashtag LLM. And in March, if you put hashtag LLM on anything, you got a million views within a week, because everybody thought LLMs were going to just change the world for them. But it's important because it gets the word out, it builds our community, and it lets people come in and see the different kinds of things we can do. And of course they continue to be important.

I'm going to switch gears really quickly and talk a little bit about some of the science that I do in my own group, which is important across the community and which this type of community-building work catalyzes. The MGI diagram shows this loose idea, and everybody puts up a graph schematic of how all these different types of data connect. So this is Ganon Murray, who was an REU student working with me this last summer, an undergraduate at Earlham College. I think what was really interesting is that we work at all levels when we build community, right? Ganon came in to synthesize some things, in this case orthovanadates, yttrium orthovanadate, which is a laser crystal and hard to make in great purity.
And so we had him work simultaneously in the lab, learning how to synthesize things and try new things, and on coding up a graphical data model that we use called GEMD, which was developed at Citrine Informatics and which I worked on some of the early stages of. It really works in terms of the currency that materials scientists work with. We have materials. They become ingredients, we put them through processes, and then we characterize them. Then we take that material, and it might go on through the cycle to do something else. So Ganon was able not only to learn to make things but to code up a graphical model of it in a Python notebook, with PIDs for everything that was used in it. And the beauty of that is not just that he came up with a model that made sense. This is Ganon's slide. It's one of the most exciting slides I have. He came up with it completely on his own in his final presentation, so I love it and I use it. He said, look, I did these syntheses, and now I can query them. I can look, and I can see relationships among them. But what really becomes important and exciting is when not only I do this but the whole community does this, because that totally changes how we do the science. Now we see a full material history, from the conception of making something, through the steps of making that thing, to the way that we characterize it and the properties that it has. Scientists live in a forest that they've never seen. But once we start connecting their data across the entire enterprise, we start to make the invisible visible.

We've taken this further into full project data. The graph on the right has about 8,000 links in it, but we have some that go over 30,000 links, and then it becomes a really cool visualization problem, right? It looks nice as a cloud, but we can actually think in terms of all those processes, just like Ganon did, and we're rolling out tools to do that sort of thing. And of course it only works when we have the whole community involved. So three
things I want to leave you with on this, really quickly. There are a lot of things on here. There are the different workflows, and, as I highlighted in yellow, public access and open science are not obvious objectives in many sciences where IP is involved. In the materials world, people want to make stuff. They want to patent it, they want to make money off it, and industry wants to roll it out. So it's tricky to build that community. But public access and open science are absolutely fundamental to the transformative advances in the field, and projects like the RCN and the other work that is done by public access at NSF really focus on tools but also, now, on people. That's critical to catalyzing this kind of work. And with that, I think we're all done. We thank you, and we'll take questions for the panel. Thanks.

There we go. I want to thank all our presenters for their great work and provocative activities. Tom, do you have a question?

I do. Am I doing this right? Yes? Okay, great. Bill, I'm familiar with PASS, and we've talked before. I got the core notion that it was a way, from one deposit, to put it in many places. Today I also got the sense that there is a link to grant compliance monitoring and traffic with a grant management system. Am I getting that right, and could you say a little bit more? And then also, with OSTP coming and the mandated deposits to what seem to be the federally designated repositories for articles, are you expecting any of the physics or the drivers around PASS to change?

Thanks, Tom. As part of the conversations that we've had over the past year with a number of institutions, it's become really clear that there's a lot of interest in the compliance aspect of what PASS can do, and in recognizing that as you have a greater number of your researchers moving through a single system to handle deposit, there's a greater opportunity to understand it from a compliance standpoint. What does that mean?
What does that look like? That's not something that's really built into PASS right at this moment, but it's a recognition that that's something PASS enables. So we're looking at how we move in a direction of ensuring that we're utilizing that data appropriately and usefully, and providing it in a way that makes it something that can actually be used for considering compliance and understanding, from the research administration's perspective, what that looks like. To the second part of your question, I think there's a lot more to be determined around what that's going to look like, and a lot more discussions to be had. Certainly there's opportunity to speculate there, but moving in this direction feels like the right way, considering that there's a lot of interest in that model.

If there aren't questions, I have a question for at least our two RCN awardees here. It has been interesting to observe the interactions of all the RCNs in the monthly calls that we've hosted for a while and in other sorts of between-project communications. Do you want to comment on that, just from your perspective?
I'm interested in any comments that you might have on that.

Yeah, I'm happy to comment on that. First of all, like everyone else, Martin, when you put another meeting on my calendar, I thought, oh. But it's been a great meeting, actually, because, for those who haven't looked at the list of the other RCNs, they cover a range of things. Some you will overlap with. There's one on PIDs for instruments and facilities. I've been to their workshops, and we already link in and have started minting PIDs for things that we just wouldn't have gotten around to or thought of, and I learned about things. But then there are things like Kathleen's project and others that are not a natural overlap, perhaps, from what we would think, and it turns out there's tons of overlap, because there is a lot of commonality in the issues we face. I've asked Kathleen specifically about the humanities. I knew that in my field proprietary information and so on was going to be really big, but I didn't know that in the humanities people would feel ownership over their data in the same ways. So we don't necessarily find the same solutions, but we commiserate, and then eventually I think we find common ground. So I think it's been really helpful from my perspective.

Yeah, I think that has been extremely helpful, and it's also been really great to see the wide range of interpretations of the coming together of FAIR, open science, and the research coordination networks, to think about how those components come together, and how these projects might interoperate and mutually support one another as we go forward. So it's been a great opportunity to be in contact with those other projects.

Are there any questions for anybody on the panel, or about the public access program in general? If not, we can give you a few minutes back before the break. So thank you all for attending, and I want to thank our panelists again. Good job.
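As a companion to the materials talk, the "full material history" idea (materials become ingredients, ingredients go through processes, the resulting material is characterized, and the whole chain can be queried) can be sketched as a small directed graph. This is a minimal illustration in plain Python, not the GEMD library or its API; the node names, the edge list, and the helper functions `build_upstream` and `history` are hypothetical, and the precursor and measurement labels are placeholders rather than the actual synthesis described.

```python
from collections import deque

# Each edge points from an upstream node to the node it feeds.
# All node names are illustrative placeholders.
edges = [
    ("material:precursor A powder", "ingredient:precursor A"),
    ("material:precursor B powder", "ingredient:precursor B"),
    ("ingredient:precursor A", "process:synthesis"),
    ("ingredient:precursor B", "process:synthesis"),
    ("process:synthesis", "material:yttrium orthovanadate sample"),
    ("material:yttrium orthovanadate sample", "measurement:characterization"),
]


def build_upstream(edge_list):
    """Map each node to the list of nodes that feed directly into it."""
    upstream = {}
    for src, dst in edge_list:
        upstream.setdefault(dst, []).append(src)
    return upstream


def history(node, upstream):
    """Return every node upstream of `node`, nearest first (breadth-first)."""
    found = []
    visited = {node}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in upstream.get(current, []):
            if parent not in visited:
                visited.add(parent)
                found.append(parent)
                queue.append(parent)
    return found


upstream = build_upstream(edges)
sample_history = history("material:yttrium orthovanadate sample", upstream)
for step in sample_history:
    print(step)
```

Once many groups record their work this way, the same traversal runs across project boundaries, which is where graphs with thousands of links like the ones mentioned in the talk come from.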