get started. Thanks for joining us today. I'm Cliff Lynch, I'm the Director of the Coalition for Networked Information, and I'll be introducing the session. This is one of the project briefing sessions from the second week of the CNI Fall 2020 virtual member meeting. I just want to note that for this week and subsequent weeks we are including not only live project briefings but also some pre-recorded presentations, and the ones for week two of the virtual meeting have now been released, so please enjoy those as well. A few mechanical things: we are recording this session and it will be available afterwards. Closed captioning is available. There is a chat box and a Q&A box. Feel free to use the chat to comment or to introduce yourself; the Q&A tool is there to queue up questions for our speakers, and there will be a Q&A session at the end that Diane Goldenberg-Hart from CNI will moderate. I think that's all I want to say about the mechanics.

So let me introduce our speakers. I'm really pleased that we have with us today Patrick Schmitz, an old colleague from UC Berkeley, now with a consulting firm, Semper Cogito, and Claire Mizumoto from the University of California, San Diego. They're going to talk about a pretty important issue, I think, one that I know has bedevilled some of our member institutions. When you look at a fully realized research data management support program, for example, it seems like the needs are infinite and the resource demands are immense. And everybody asks themselves, well, we know we're not perfect, but how are we stacking up against other institutions? How mature is our program? That's been a tough thing to get a handle on, and I think we're going to hear today about an approach that gives some real insight into it. So with that introduction, let me hand it over to Patrick to start the presentation. Patrick, Claire, thank you so much for joining us today. Over to you.

Thank you, Cliff. And I want to thank the CNI organizers for allowing us to share our work with you today. This has been a lot of fun to work on. I'm Patrick Schmitz; as Cliff mentioned, I'm with Semper Cogito. I do consulting around research IT assessment and strategy for higher ed, and I'm the co-chair of the CaRCC working group that supports this work. Claire Mizumoto, my colleague, leads research IT at UC San Diego. She's also my co-chair on the CaRCC working group, she's the co-coordinator of the researcher-facing track of the People Network, which you can hear more about in an hour, and she's on the EDUCAUSE Research Computing and Data Community Group steering committee.

I want to first define what we mean by research computing and data, which we tend to abbreviate as RCD because we're lazy. It includes the technology, services, and people supporting the needs of researchers and research, and it's intended as a broad, inclusive term covering computing and data, of course, but networking and software as well. The National Science Foundation often uses the term cyberinfrastructure; others use research IT. This slide notes the key involvement of a number of partners in this work. Internet2 originally seeded these ideas and got the discussions going. EDUCAUSE support includes the Research Computing and Data Community Group and the team that supports the Core Data Service survey, which many of you will be familiar with and which brings expertise in this kind of data gathering.
CaRCC is facilitating a working group that has developed the tools and supports the work on an ongoing basis. I expect most of you are familiar with Internet2 and EDUCAUSE, but you may not be familiar with CaRCC, so I'll mention it briefly: we're an organization of dedicated professionals developing, advocating for, and advancing campus research IT and the associated professions. Some of the things we're working on right now are connecting the broader research computing and data ecosystem; professionalization of our work in an HR framework, which, again, is going to be presented in an hour by some of our colleagues at CaRCC; a report we produced defining the stakeholders and some of the shared value propositions for the community in this time of accelerating change; and, finally, of course, contributing to this capabilities model for research computing and data.

So often we're faced, as Cliff mentioned, with questions like: how do we get a comprehensive view of our support? How do we represent the state of our research computing and data services to our leadership and key stakeholders? Are we missing something? What should we be paying attention to that we might not have thought about? And as we think about strategic planning, funding, and hiring, how can we identify areas for improvement and growth? Well, this is really why we built the model. It allows institutions to assess their support for computationally and data-intensive research. The structured model facilitates discussion among different groups, roles, and service providers. It helps you identify gaps; those gaps may be on your radar already, or they may be in areas you weren't thinking about. The model was developed by a diverse group of institutions with a range of support models, in the collaboration I mentioned among several organizations, and the assessment tool is designed for use by a broad range of institutions. The structured model, shared across the community, provides a valuable data set for understanding broader trends as well as patterns among different groups, for example R1s versus R2s and others. As I mentioned, it's designed for a range of roles and stakeholders: folks like us who are doing campus research computing and data, PIs and research team members, partners ranging from central IT to centers and institutes on campus and, of course, libraries, and then campus leadership as well. And we designed it to be useful across institution types and organizational models. That last point is key: we try to be agnostic about how your institution provides services, whether centralized or federated, and instead emphasize what researchers have access to.

So what is the tool we're talking about? This is a screenshot of our V1 prototype, implemented as a spreadsheet, which allows for collaborative effort as you fill it out. Along the left side you see a series of questions it poses for you to consider. The answers yield a computed coverage value, and those values are color coded to provide a heat-map view of your assessment, ranging from green for good coverage to red for the gaps. It divides areas of capabilities based on a concept called facings, with one tab for each area. Let me take a quick aside to explain this important concept. Facings came out of a workshop in 2017, and the idea was subsequently adopted by the capabilities model as well as some of the professionalization work going on at CaRCC.
There are five facings that we use to divide up the kinds of work we do and the roles we have. Researcher-facing roles include research computing and data staffing, outreach and advanced support, as well as support for managing the research life cycle. Data-facing roles include data creation, data discovery and collection, data analysis and visualization, research data curation, storage, backup, preservation and transfer, and research data policy compliance. Software-facing roles include package management, research software development, research software optimization and troubleshooting, workflow engineering, containers and cloud computing, securing access to software, and software associated with physical specimens. Systems-facing roles are probably the best understood by many organizations, because they include the folks who run infrastructure and systems, systems operations, and system security and compliance. And finally there are the strategy- and policy-facing roles; these include institutional alignment, establishing a culture of research support, funding of course, and partnerships and engagement with external communities.

Coming back to the tool, then, there are a couple of different lenses we ask you to consider as you look at each of these questions. One of them is deployment at your institution, ranging from no deployment or support — we just haven't got anything for this, haven't thought about it — to tracking use or being in a pilot stage, on up to deployment to parts of the institution, like maybe a college, or, if you're really mature, deployment institution-wide for a given area. Another lens borrows ideas from IT service management to think about a service operating level. Again, you might have no support at all; or you might be running something, but it depends on a grad student with a grant running out in six months, so you're at substantial risk of failure; or it might be lights-on only, which is fairly common in a constrained budget regime; or you might have a decent base level of service; or, if you're really well funded, you might have a priority or premium service where you're always adding new features and the like. So those are the kinds of things we ask people to think about, and those roll up to the assessed coverage value. In addition to assessing your current state, the team can mark specific topic areas for attention. This was something requested by our community; note that you may have low coverage and still not make something a priority, because it may not align with your institutional priorities, or, even with good coverage, you may want continued investment and development. As you complete the assessment, the summary sheet shown here automatically rolls the different areas up into a view you can use in discussions with leadership and key stakeholders; there's even a bar graph you can grab for reports or presentations, which was another request from our community. Each facing has a summary coverage value, and you can drill down one step to see the major themes within that facing, providing a high-level view of your support. In this hypothetical example, you'll see that the hypothetical group has no or very little coverage in workflow support, and that's called out in bright red so you can easily see the gap. I mentioned the priorities-marking feature; the summary view also gathers up the items your team marked as a priority — again, just to make it easier to feed these ideas into your strategic planning work or to generate a quick report for your leadership.
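As a rough illustration of the roll-up Patrick describes — the actual spreadsheet's scales and formulas are not shown in the talk, so the level names, weights, and color thresholds below are assumptions, not the model's own — a per-question coverage value derived from the two lenses might be combined and mapped to a heat-map color along these lines:

```python
# Illustrative sketch only: the real RCD Capabilities Model spreadsheet defines its own
# scales and formulas; the level names, weights, and thresholds here are assumptions.

# Two "lenses" per question, each mapped to a 0.0-1.0 score.
DEPLOYMENT_LEVELS = {
    "none": 0.0,                    # no deployment or support
    "tracking/pilot": 0.33,
    "parts of institution": 0.66,   # e.g., a single college
    "institution-wide": 1.0,
}

SERVICE_LEVELS = {
    "none": 0.0,
    "at risk": 0.25,        # e.g., one grad student on a grant ending in six months
    "lights-on only": 0.5,
    "standard service": 0.75,
    "priority/premium": 1.0,
}

def question_coverage(deployment: str, service: str) -> float:
    """Combine the two lens scores for one question (simple average, as an assumption)."""
    return (DEPLOYMENT_LEVELS[deployment] + SERVICE_LEVELS[service]) / 2

def facing_coverage(answers: list[tuple[str, str]]) -> float:
    """Roll question-level coverage up to a facing-level summary value."""
    return sum(question_coverage(d, s) for d, s in answers) / len(answers)

def heat_map_color(coverage: float) -> str:
    """Map a coverage value to a heat-map bucket (thresholds are illustrative)."""
    if coverage >= 0.7:
        return "green"
    if coverage >= 0.4:
        return "yellow"
    return "red"

# Example: a hypothetical data-facing assessment with a clear gap in one question area.
data_facing = [("institution-wide", "standard service"),
               ("parts of institution", "lights-on only"),
               ("none", "none")]
cov = facing_coverage(data_facing)
print(round(cov, 2), heat_map_color(cov))
```

The point of the sketch is only the shape of the computation: each question gets a score from its two lenses, a facing's score is an aggregate of its questions, and the summary sheet colors those aggregates so gaps stand out.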
Here, for example, you'll see that a number of workflow areas, not surprisingly, were marked as priorities.

So who's using this? At this point over 125 institutions have downloaded the tools and are exploring them. That represents 44 states around the country and a couple of Canadian provinces, and as you see there's a pretty good mix of Carnegie classifications and of public and private institutions. We would like to do a lot more outreach and have more minority-serving institutions using this; we've been working on that, and we're beginning to see some that have started taking it up. We asked those folks how they think they're going to use the model, and you can see that many of them were really interested in benchmarking as well as strategic planning, and emerging centers in particular noted that it was really helpful to have a model for understanding common practices as they plan to build up a center or apply for funding.

So we gathered a community data set this year: of the folks who downloaded the tool, 41 were able to complete their assessments and contribute them to the 2020 community data set. We're still in the process of analyzing it, and I should mention that the institutional demographics of the participating institutions are largely comparable to those I showed for the larger community that has downloaded but not yet necessarily completed the assessment. Here's an example of some of the early data from the community data set analysis. You can see that institutions generally do better at the systems-facing capabilities, then strategy- and policy-facing, and to a somewhat lesser extent researcher-facing, and they have more challenges around the data-facing and software-facing capabilities; this is for everybody, across all facings. One thing I should note is that those error bars are one standard deviation, so you can see there's a very broad range of variation among the different institutions, and I think that's one of the things that really came out of the data. This is another way of looking at that variation among the represented institutions: in this scatter plot, each vertical slice is one of the 41 reporting institutions, the different colors represent the different facings, and you'll see they are literally all over the map. There's a very broad range of coverage across institutions. Of note, only a few institutions have coverage values that are consistent across the facings — there are just a couple of those. Most have fairly different levels of coverage in each facing, or at least one facing for which coverage is quite different. Another point here is that there's little commonality in the relative ranking of facing coverage across institutions, which is to say different institutions have strengths and weaknesses in really different areas. Diving in again, here's a view into the data comparing how public and private institutions reported their coverage. You'll note that the private institutions consistently report higher capabilities coverage than the represented public institutions. This may not surprise folks, but it is really useful to have real data to show where we actually are.
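To make the kind of community analysis described here concrete — the records and numbers below are invented for illustration and are not drawn from the 2020 data set, and the field names are assumptions — the per-facing means with one-standard-deviation error bars, and grouped comparisons like public versus private, could be computed along these lines:

```python
# Sketch of the community-data-set roll-up (mean coverage per facing, plus 1-SD error bars).
# The institution records and coverage values below are invented for illustration only.
from collections import defaultdict
from statistics import mean, stdev

# Each record: one institution's facing-level coverage values (0.0-1.0) plus grouping info.
institutions = [
    {"control": "public",  "facings": {"systems": 0.8, "data": 0.5, "software": 0.4}},
    {"control": "public",  "facings": {"systems": 0.6, "data": 0.3, "software": 0.5}},
    {"control": "private", "facings": {"systems": 0.9, "data": 0.7, "software": 0.6}},
]

def summarize(records):
    """Return per-facing (mean, one standard deviation) across the given institutions."""
    by_facing = defaultdict(list)
    for rec in records:
        for facing, cov in rec["facings"].items():
            by_facing[facing].append(cov)
    return {facing: (mean(vals), stdev(vals) if len(vals) > 1 else 0.0)
            for facing, vals in by_facing.items()}

# Overall summary, and a public-institution slice like the grouped comparisons in the talk.
print(summarize(institutions))
print(summarize([r for r in institutions if r["control"] == "public"]))
```

The same grouping step would apply to any slice of the data set — Carnegie classification, public versus private, or EPSCoR eligibility — before computing the per-facing summaries.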
Also note that the variation among public institutions here is considerably greater than among the privates, except, again, in the area of systems-facing capabilities, where they're a little more closely aligned. Here's another example, looking at general areas within the researcher-facing capabilities, and here we're grouping by Carnegie classification. Perhaps unsurprisingly, R1s have considerably higher coverage in staffing, outreach, and advanced support capabilities than R2 campuses, and both groups have higher coverage than other types of institutions, which include master's and baccalaureate and some other groups. However, looking at the capabilities around research life cycle management, the distinctions are not quite so clear, nor is the variance. We took the error bars off of this chart for clarity, but I can tell you there's not as much variation there. So it's interesting how in some areas there's a much broader distinction, and in other areas folks are a little more closely aligned regardless of Carnegie classification.

Now, I'm not sure if everybody's familiar with EPSCoR. EPSCoR-eligible states are those that receive less than 0.75 percent of NSF research funding, so very little research funding reaches eligible institutions in those states, and EPSCoR is a program NSF designed to try to level things up a bit. This graph shows the gap between institutions in those states and institutions in states that get more NSF funding. While a few areas are not too different, in general EPSCoR-state institutions find themselves with significantly less capabilities coverage in some key areas, like data visualization, and it's really stark for security and sensitive data support. So again, the folks in the EPSCoR community are really happy to have this kind of data to ground their reality, so that as they think about how they can work together and what they might want to do as a community of EPSCoR institutions, they have data to make the case for where they really have gaps and need investment.

So that gives you some idea of what this kind of data can tell us, and there's a lot more. We're working to complete our analysis and the report on the 2020 data set. A summary report and some high-level data will be available publicly; the detailed data within the facings, all the way down to the individual question level, is only available to contributing institutions — it's part of how we motivate people to contribute their information — and that report is going to be coming out a little later this fall. At this point we are really working to promote and support the current model to a wider audience, and we're also working to design and build a hosted survey implementation. This would include a proper survey tool with a database backend and a data portal that will allow institutions to explore the data set and relate their own assessment to peer groups and others within the community. We're currently in the process of going after funding for that. The other big point is that we really want to expand our community and share experiences and stories of use, and on that point I'm going to turn it over to Claire, who's going to share some of those now.

Thanks, Patrick. Hello, everyone. As Patrick just described, there's a wide variety of institutions who have submitted their data for the community data set.
I'll take just a couple of minutes to describe the approaches from a few of those institutions who have completed the model. At each institution there was some individual who spearheaded the campus's completion of the model, and they almost always had to tackle not only an introduction to the capabilities model but exactly those things Patrick just described to you: the concept of an actual RCD community, the facings, and why it makes sense to use the model as a strategic planning tool. Then the institution had to devise a plan for how to answer the questions and use those answers going forward. Next slide, Patrick, thanks.

In order to be successful in getting the model completed, each institution had to determine how best to engage its research computing and data community. At a few institutions, like Arizona State University, the campus advocate who was spearheading the use of the model worked through the questions by answering what he could and then providing applicable portions of the model to stakeholders and subject matter experts to answer the rest. At ASU, following the submission of their data, the Senior Director of Research Technology, Doug Jennewein, has done presentations to university leadership on their completed data as they determine how to use it for strategic planning. At my own institution, the University of California, San Diego, a small subset of RCD professionals completed the model after I introduced it to a group of researcher-facing professionals and to our research IT governance group. Patrick and I also showed it to the University of California systemwide research IT committee, which is made up of representatives from the campuses in the UC system — including UCLA and Berkeley and the rest of the ten campuses — our medical centers, the San Diego Supercomputer Center, and Lawrence Berkeley National Lab. We're hoping that the results will help each campus individually as well as give us the opportunity to leverage the entire UC system's data collectively for bigger efforts. There was one research institute that had just recently done a survey of their researchers; they took advantage of the results of that survey to begin answering parts of the model, and after those initial questions they followed up with interviews and convened regular meetings to work through the rest of the questions. In another approach, a higher ed system with multiple schools, including a community college, in one EPSCoR state utilized a consultant to help them coordinate their answers. At the same time, one of the national supercomputer centers decided to focus on their internal customers in their first year of completing their data, with plans to widen their focus in the coming years. And as a final example, the RCD program at the University of Nevada, Reno took advantage of the small size of their program to conduct one-on-one meetings with the facing experts to answer sections of the model. So it's clear that there are many ways to complete the model. It's also clear that institutions are motivated to complete the model for internal use, but we also have groups who are very eager to have access to the community data set to see how they're doing compared to other institutions. And we've heard from several institutions — as Patrick noted, folks who downloaded but did not complete and submit their data in time — that they're laying the groundwork this year in hopes of submitting next year.
So really that's just a quick snapshot of the different strategies people were using, and we got to hear a lot about them: we held office hours and were able to get a peek into how people were approaching getting to answers for the model. Patrick, next.

All right. So thanks — we really appreciate you coming, and thanks, Claire, for grounding us a little bit in the experience of actual institutions. We'd be happy to answer any questions you have at this point. I'm going to stop sharing my screen so you can see us a little better.

Thanks so much, Patrick and Claire. A very interesting model, and some of those statistics were really fascinating; they help to highlight some of the gaps and the areas of need. We really appreciate you coming to CNI to share that with us, and thanks to our attendees for making time out of your day to join us here at CNI. The floor is open for questions; please enter your questions in the Q&A box. I see that Claire has shared some URLs in the chat box, and we will also put the slides up on the Sched page and later on the project briefing page on CNI.org. While we're waiting for folks to share their questions, I was curious: can any institution participate in this project? Are there any limitations, or is it open to anyone?

It's open to anyone. The tool itself is available under a Creative Commons license, so people can get at it. There's a guide to the tool that takes people through the concepts in the model as well as some suggestions, of the kind Claire was talking about, on how different institutions approach going through it. So it's all open. We've had folks ranging from, as Claire mentioned, a couple of community colleges on up through a lot of R1s. We're also working with the EDUCAUSE folks, who are thinking about tweaking the way they handle institutional metadata so that it's easier for institutions outside the U.S. to participate, since the Carnegie classifications really only cover U.S. institutions. We've had several Canadian institutions use the model, and they've just aligned to what they think the equivalent Carnegie classification would be, but that's really about the only constraint we have.

Interesting. Okay, thank you. I think Clifford has a question. Cliff, do you want to go ahead?

Sure, this is just a quick one. This potentially allows institutions to look at themselves every couple of years and see how they're trending in various directions as their research computing and data support evolves. Do you have plans to take it in that direction as well?

Absolutely. I mentioned that we put together a design for a second version of this that would be survey based, with a data portal, and there are a couple of things we want to make certain are easy to do. One of them is to easily compare your results to a given slice of the community data. So you might say, well, I want to compare myself to R2s, or to R1s, to see who my peers are and what my aspirations are, and use different slices like that to benchmark yourself — and then also to look at longitudinal data. We're interested in that at the institutional level, where you might say, okay, we did the assessment, we did some strategic planning and strategic funding, we made pushes in some areas, and two years later we want to reassess and ask what the impacts of that investment were and what the next steps should be.
So at the institutional level, I think that data over time would be helpful. We're also really interested in looking at the community over time and seeing what the broad trends are, and that includes rolling up both the capabilities assessments and the priorities. It's been a little trickier doing the analysis of the institutions that shared their priorities-marking data, to see, broadly, where people are interested in focusing and how that changes over time. So that's a lot of what we're hoping the community data sets and a good analysis portal will allow us to explore.

Thank you, and thanks for the question, Cliff. Well, it looks like we do have another question, from Ithaka S+R: "Apologies if this was covered at the beginning. Assessment is built into the culture of academic libraries, but I'm less familiar with how typical it is for research computing IT professionals to engage with their users and assess offerings in this way. Can you speak to the professional culture?"

Sure. Claire, do you want to speak to that a little, since it's one of the other big tracks for CaRCC?

Sure. I think the biggest thing I would start by saying is that research computing and data professionals are not easily identified and well known. Getting our hands around who the people are who work in the research computing and data area is probably the biggest challenge for us, because we are not talking about just systems-facing or data-facing staff; quite often it's data wranglers who are most closely associated with the research. So we do have a challenge as what is almost a fledgling profession. And that is the biggest reason why we worked through and developed those facings: we had to get our hands around how big the research environment and the research life cycle are, and everybody who is involved.

It's also the case that folks who do this kind of work sit in the middle right now. In many cases there are no good HR classifications or frameworks for them. They don't look like traditional IT folks, because they have to have had research experience and often were researchers who moved into this work. But they're also not researchers, right? They have different goals and different motivations. And so this is also one of the main motivations for the work on professionalization that's going to be presented a little later. More broadly, to the culture the questioner was asking about: as Claire mentioned, part of it is that it's hard to even wrap your head around who we're talking about as a culture and what you're assessing. There are groups that do a really good job of understanding that they have to make the case for the impact of their program, and so they'll assess at that level: how much in grants do they support? How many researchers have they engaged with? Very often it devolves into things like how many compute cores they support or how much raw resource they have available. But that's not consistent in any way, and it's not done all that well at many institutions. So part of it is that we kept coming up against people saying, well, we don't know how we should be looking at this — how do we benchmark ourselves, and what are the things we should even be thinking about when we're looking at assessment? So I think there was a real hunger for it.
People had a sense that that was a gap, in terms of both running their programs well and making the case for funding and investment in their programs. And so that was really where the motivation for doing this work came from, three or four years ago.

That was really interesting. And I just want to let everyone know that the next session, which has been referenced during this one, is coming up in about a half hour. It's called "Professional Development and the Development of a Profession: A Research Computing and Data Community," and there will be a lot of discussion there about CaRCC. So thanks for the question, and thank you for addressing that, Patrick and Claire; that was really interesting. I think we have another question here, from Lewis King: "The approach seems to be a bit of a hybrid between a capability map and a maturity model. I wonder if you have considered articulating a capability map as a graphic representation of these ideas."

That's an interesting idea. It is interesting, and I don't know that I completely understand what you mean by a capability map; that's one I'd definitely like to look into more. We originally talked about this within the relatively well understood rubric of a maturity model. But partly as we looked at the broad range of institutions that wanted to use it, there was a sense that a maturity model carried a level of judgment that didn't really make sense, in part because we really were looking for the union of things people should be thinking about, and it wasn't necessarily the case that anybody was going to reach 100% maturity on any of this. It didn't make sense that way. And so we thought it made more sense, and would actually be more inclusive, to call it a capabilities model. Claire, do you know much about capability maps?

I do not.

If you've got a reference to that, Lewis, I would be really interested to know more about it, because I do think one of the things we have been struggling with is how best to present some of this. We have a heat map that's in the tool itself, and we're looking at some visualizations of the data, but if there are better ways to help people understand what it is we're actually talking about, we'd be really open to them. Thanks — that'd be great.

Right. I think you saw that Lewis said he'd reach out with some information, so that's great. Thanks for the question and thanks for your responses. And seeing as how we are a little bit past time here, I think I'm going to go ahead and close out the public portion of this presentation. I'll turn off the recording; we have a little bit of time before our next session, so if attendees want to hang around and approach the podium, as it were, and have a chat with Claire and Patrick, please feel free to raise your hand and I'd be happy to unmute you. And thank you again to our speakers and our attendees for joining us here at CNI. We look forward to seeing you in other sessions. Bye-bye.