Hello and welcome to a presentation on the library-led research information management system at Oklahoma State University: collaborations and challenges. This presentation was created for the Coalition for Networked Information's project briefing video series. I'm Clark Iacovakis, scholarly services librarian at OSU, and I'll be joined in a few minutes by Megan Macken. The two of us, along with Matt Upson, associate dean at the OSU Libraries, are the primary project managers and administrators of the system. This session will focus particularly on the collaborations between the library team, campus IT, and data owners across the university in configuring the data feeds. We'll provide a number of examples of challenges we confronted as we migrated personnel, grants, and teaching data from local databases to the new system. This session will include solutions that we developed for mapping complex data structures, addressing privacy considerations, and handling dynamic data that changes over time. A research information management, or RIM, system is a software platform for the collection, storage, and linking together of metadata related to faculty service, grants, teaching, and scholarly activities. Essentially, it is a single point of access to data that is typically stored across multiple internal and external platforms. So, for example, publications data may be harvested from external databases like Scopus, PubMed, or ORCID; personnel, grants, and teaching data may be fed in from internal databases; and data may be migrated from existing RIM systems on campus or added manually by individual faculty. The advantage to having all of this in one place is that it can save a lot of time and increase the accuracy of the reports that universities need to create for multiple reasons. These systems can also help researchers, funders, and members of the public identify expertise within the university, and they can enable faculty members to communicate their research focus and activities more effectively.
At Oklahoma State University, we began implementation of our RIM system, which runs on Symplectic Elements, in late 2019. The project is supported by the Office of the Provost and the Vice President for Research, and implementation is led by the OSU Libraries in partnership with IT and other university staff. From the beginning, librarians have managed the publications harvesting, where we've gathered Scopus author IDs, ORCID iDs, and other publications data, which gave most faculty a solid baseline of publications linked to them from the outset. IT configured our single sign-on, which is CAS, and added the web addresses to the OSU domain. Three other data feeds were to be generated from databases in the university's enterprise resource planning system, which is Banner. These include personnel information, which is managed by Human Resources; grants data, which is managed by the Office of Grants and Contracts Financial Administration; and courses taught, which is managed by the registrar. IT has been heavily involved in all conversations, from the initial exploration of systems, when the vendor visited campus and provided a demo, through to implementation. They asked a number of questions relating to the system: for example, single sign-on options, where the data is stored and backed up, what the technical architecture looked like, how data transfer is secured, and other important technical questions. Satisfied with the responses, OSU opted to contract the vendor to host the data. Hosting with the vendor allows them to manage all support questions, upgrades, backup, disaster recovery, and so on. The contract with the vendor stipulates that the university retains ownership over all the data, that the vendor will not provide the data to any third parties, and that our data would be destroyed upon termination of the agreement. Furthermore, faculty users can set any object that is linked to them to private or internal and can also control the privacy level of their profiles.
Elements also provides a number of roles within the system that give flexibility over who has access to what data. Access to the underlying reporting database is held by a small number of administrators, who can create custom SQL queries and make those queries, or the results of those queries, available to select individuals. Within the web application, people with the statistician role can run these reports over the university as a whole or over different colleges or departments, based upon their level of permissions. For each data feed, we brought together a team: two librarians (Megan and me), an IT staff member, and a data owner or a staff member with access to and knowledge about the data. We also contracted with Symplectic to provide guidance and a process for formatting the data to be imported into Elements. Each member of the team brought specialized skills and knowledge that helped considerably to keep the process moving forward and to address challenges when we confronted them. For example, the librarians were the project managers. We brought our experience with data mapping, building data structures, creating crosswalks, and managing metadata. We also brought knowledge of the system itself: as system administrators, we received training and developed extensive knowledge of the system. We maintained contact with faculty, administrators, and other users of the system, and we are faculty members ourselves, so we're also users of the system. And we also worked to translate information and clarify goals between different stakeholders. IT, of course, brought their technical knowledge and wrote specialized queries to extract data from Banner. They reshaped that data into the format that was required for import into Elements. They also created an FTP process for placing the data on Symplectic servers, and for retrieving log files and making those available to us so that we could troubleshoot any problems with the data feeds.
Finally, the data owners brought their knowledge of what data we had in Banner: what data fields were available, how complete the data was, how far back it went, how accurate it was, and of course any privacy considerations as well. Elements is a private system by default: only users who are on the HR feed will have profiles, and only people currently active at the university can log into the system. So our first step was to construct a file including these users. In consultation with HR, we established criteria for who would automatically have an Elements profile based on their employee classification codes. We configured this with IT and HR to include all faculty, including tenure-track faculty, adjuncts, visiting professors, and lecturers. The feed would also pull in the individual's college and department, their email address and phone number, and some other information. In consultation with HR, the feed was configured to refer to a flag in Banner indicating whether the faculty member should have any information about them kept confidential, or whether the person should be left off the feed. The feed is deposited nightly on Symplectic servers to be picked up and run, updating data and reflecting changes by the next morning. So, for example, if someone's phone number changes or if they leave the university, the data will be updated relatively quickly. There were several challenges in constructing this HR file, and I'll review some of those here. The department-level data was somewhat challenging in that faculty serving as deans would be in a provost group rather than their academic departments. We therefore pulled both the department in which a person was tenured and their administrative department, and Megan constructed the groups and group hierarchies based on these codes.
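The HR feed rules described above (include people by employee classification code, leave off anyone flagged for privacy, and carry both the tenuring and administrative department) can be sketched roughly as follows. This is only an illustration under stated assumptions: the classification codes, field names, and the `build_hr_feed` function are hypothetical, and the talk does not describe the actual Banner queries or the Elements import format. For simplicity, the sketch treats the privacy flag as "leave the person off the feed".

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical classification codes; the real Banner codes are not given in the talk.
ELIGIBLE_CODES = {"TENURE_TRACK", "ADJUNCT", "VISITING", "LECTURER"}


@dataclass
class Employee:
    employee_id: str
    classification: str
    tenure_department: Optional[str]   # department in which the person is tenured, if any
    admin_department: Optional[str]    # administrative home, e.g. a provost group for deans
    confidential: bool = False         # assumed stand-in for the privacy flag in Banner


def build_hr_feed(employees):
    """Return one feed row per person who should automatically get a profile."""
    rows = []
    for person in employees:
        # Only people with an eligible classification are included automatically.
        if person.classification not in ELIGIBLE_CODES:
            continue
        # In this sketch, anyone flagged for privacy is left off the feed entirely.
        if person.confidential:
            continue
        rows.append({
            "employee_id": person.employee_id,
            # Both departments are kept so group hierarchies can use either code.
            "tenure_department": person.tenure_department or "",
            "admin_department": person.admin_department or "",
        })
    return rows
```

Keeping both department codes in each row is what lets a dean appear under an academic department as well as an administrative group when the hierarchies are built.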
We largely resolved the department issue, though we do have a few cases that fall in the margins, such as administrators who come to OSU with tenure, who may not automatically have a tenuring department, or people who earned tenure in one department but then transferred to another. In addition, there are people who are not automatically added to the feed but still need a profile in the system. For example, Elements includes the ability to designate another Elements user as a delegate, which allows that person to edit data in someone's profile on their behalf. Some colleges and departments have designated administrative staff for this task, so we needed to add them to the HR feed so they could log into the system. Also, some colleges have research and extension staff and others whose employee classification codes left them off the feed. We therefore worked with IT and HR to create a process for appending anyone we request to the main HR feed before it is sent to Symplectic. IT also built a mechanism to overwrite certain data from the main feed. For example, emeritus faculty, who may have officially retired, can be set to active so that they're still able to make their profile public and modify their data. These are just a few examples of solutions we came up with to address the complex and dynamic data. Finally, IT built an application to query the Elements API whenever someone lands on an OSU directory page. This query will determine if the person has made their profile public, and if so, it will provide a URL to their experts page and add their photo. This is a great example of ways to connect different tools and data sets, and a product of the fruitful collaborations we've developed.
My name is Megan Macken. I'm a librarian in Digital Resources and Discovery Services and a member of the Elements implementation team at Oklahoma State University Libraries. I'll speak now about two additional data feeds in Elements: grants and teaching.
The grants feed is derived from grants and contracts financial accounting records. This is another example of data that was not intended for public view. It is managed by Grants and Contracts Financial Administration, or GCFA, in Banner, the same information system as the human resources feed. GCFA manages award-related accounting and compliance, among many other things, including business income and reporting, as you can see here on their website. Transforming this data for public view required some effort, especially at the outset, when the project team for this feed worked to understand the structure and usage of the source data. The project team included a GCFA accountant, a software services manager from campus IT, Clark, and me. Part of the challenge was that only the accountant was familiar with the source data. He explained the purpose and structure of particular data and various abbreviations and codes to other team members, because half of the team did not have access to the GCFA system. The accountant also found examples of unique or complex data requirements, and then created exports and screenshots to illustrate them. Screen sharing made this communication a little easier as we switched from in-person to virtual meetings when the pandemic began. Overall, this feed required extensive discussion to bring everyone up to speed before the data feed could be activated. Long-term grants or contracts renewed each year provide a good example of the unique, complex data needs for the grants feed. These grants may have different principal investigators, or PIs, and contributors over time, and many sub-accounts or funds. The grant I show here includes mentorship of additional PIs and replacement project leaders in the proposal, so changes in PIs are inevitable. With this dynamic data, GCFA uses a relational data structure with a parent grant that may have many child funds.
In this slide you see two different examples of the parent-child structure: from the NIH portal on the left, which GCFA replicates, and from Elements on the right. Initially we tried to flatten the GCFA relational data into the out-of-the-box structure in Elements, a single grant object. Elements records the individual accomplishments of each researcher by linking that person to a single object. However, this didn't allow us to accurately distinguish how much funding each individual faculty member had been awarded. For example, one professor may be a PI on an NIH grant of over $2 million, while another professor is responsible for a $20,000 fund within that grant. In our initial flattened transformation, it appeared that all faculty on the project had been awarded $2 million. To resolve this, we worked with the vendor and campus IT to create a custom solution. We now create separate child objects for grants that may have multiple funds. This way the cumulative funding amount for the ongoing grant is available, and faculty may be linked to the specific funds to accurately reflect their contribution. Correct attribution is especially important here, not only because the system generates public profiles, but also because the RIM system is used to create annual appraisal and development reports and other faculty activity reporting. The fields within each grant or fund required some customization as well to map from the GCFA database to the RIM system. In addition to built-in fields, we created new fields for the various codes, departments, programs, and agencies, including pass-through agencies, used to track grants and contracts. We fed in only the minimum data needed for accurate reporting and for following up with GCFA on particular grants and funds. The vendor provided a set of spreadsheets to record decisions about mapping data from one system to another.
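The difference between the flattened grant object and the parent-child structure described above can be illustrated with a small sketch. The class and function names here (`Fund`, `Grant`, `flattened_attribution`, `fund_level_attribution`) and the dollar amounts are hypothetical and chosen only to mirror the example in the talk; they are not the Elements data model or the GCFA schema. Where a fund lists several people, this sketch credits each of them with the full fund amount rather than splitting it.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Fund:
    fund_id: str
    amount: float
    investigators: list  # names or IDs of people linked to this specific fund


@dataclass
class Grant:
    grant_id: str
    funds: list = field(default_factory=list)  # child Fund objects

    def total(self):
        """Cumulative funding across all child funds of this parent grant."""
        return sum(f.amount for f in self.funds)


def flattened_attribution(grant):
    """Single grant object: everyone appears to hold the full grant amount."""
    people = {p for f in grant.funds for p in f.investigators}
    return {p: grant.total() for p in people}


def fund_level_attribution(grant):
    """Parent-child structure: each person is credited only with their funds."""
    totals = defaultdict(float)
    for f in grant.funds:
        for p in f.investigators:
            totals[p] += f.amount
    return dict(totals)


# Illustrative amounts only.
example = Grant("G1", [
    Fund("F1", 1_980_000.0, ["prof_a"]),
    Fund("F2", 20_000.0, ["prof_b"]),
])
print(flattened_attribution(example))
print(fund_level_attribution(example))
```

The parent still exposes the cumulative amount through `total()`, while per-person reporting reads from the child funds.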
While this mapping process was familiar to librarians, who are used to migrating data between systems, it was new to other team members. The library team guided the process between the data managers and the vendor by interpreting system-specific terms for everyone and identifying appropriate target fields in the new system. Once the data structure was created and the fields were mapped, we encountered additional hurdles during the migration process. Initially, delimiters within the GCFA data caused errors in the CSV data transfer. Additionally, publicly sharing the data revealed errors accumulated over the years that were previously inconsequential when the data was only shared internally. The errors were generally in the spelling, capitalization, and titles of grants, and sometimes contributors needed to be updated. Because the grants data is fed into Elements, faculty users cannot edit their own grants data. Our solution for addressing these issues was a user ticketing system built in Qualtrics. Faculty complete a form, which is automatically routed to the campus IT ticketing system. Grants inquiries are assigned to GCFA staff, who edit the data. Requests to make grants private or hide them from public view are not routed to GCFA, but are instead handled within Elements by a librarian system administrator. Additionally, error logs notify us that some grants fail on import because they have a PI who does not have a faculty classification picked up by the HR feed. When appropriate, these users are added manually to the HR feed, as Clark described earlier in the presentation. Because the grants feed is set up to pull data on a weekly schedule, these updates do not appear until the next weekly run. Finally, any grants connected to people who are excluded from the system for privacy reasons are not imported into Elements at all. The third data feed brings teaching data into the experts directory from Banner, the same system as the HR feed and the grants feed.
The fact that this data is dynamic throughout the semester added complexities to the system integration. For this feed, Clark and I worked with the registrar's office and campus IT staff, who were all well acquainted with the data. This made structuring and mapping the data into the RIM system a much smoother process. We customized some fields but not the data structure. Instead, it was more challenging to determine when the data was finalized and to establish a schedule for transferring it into Elements. The teaching feed is set to run at the beginning of each semester, three times per year, to load the data for the prior semester. Classes vary in length. For example, some classes run only eight weeks, starting in the middle of a semester, and instructors sometimes change after a course has begun. And of course students drop and add classes throughout the semester. We do not have student users in the system, but we do include enrollment counts for the individual courses in Elements. This information may appear on reports, including annual appraisal and development reports. While some faculty expect the classes they are teaching to appear in Elements as soon as they begin teaching them, most are content with a schedule that produces more accurate reporting, and instructors who would like to have their courses in the system sooner may create temporary manual records. One college keeps only local records of its clinical courses, which are taught by dozens of co-instructors. As a result, this local data does not appear in the teaching feed automatically and must be manually entered. We provided some guidance to the administrators for this college on working around this issue, including the recommendation that they explore avenues for moving that process into Banner. This research information management system implementation involved similar processes for the human resources, grants, and teaching feeds.
However, the data documenting faculty activity on campus is complex and dynamic. While all of the respective data sources derive from the same information system, Banner, the structure, tidiness, and management of the data varied widely. There had to be clear, intentional communication with stakeholders across campus, including the registrar, human resources, financial administration, and the faculty whose data was represented. Communication across all of those groups was vital. The library-led implementation team, who are also faculty members, served as project leaders as well as conduits between data managers and other faculty users. This positioning enabled us to work through obstacles, such as the examples given here, in shaping the data and establishing workflows, so that we could effectively build a RIM system that supports faculty needs for annual reporting and for public profiles, and configure a system that fulfills the reporting needs of departmental and college administrators and of the university as a whole. Data structure impacts activity reporting over the long term, and may also influence the perceived performance of faculty and departments. For this reason, the extra effort we made to understand the nuances of each data source and to create avenues for rectifying inaccuracies was an essential part of developing our RIM system processes. Thank you for this opportunity to share our project with you. We'd love to talk with you more about it. For more information, please contact us directly or visit our support site and public profiles page at experts.okstate.edu. Thank you.