This is Gerry Bayne at the Coalition for Networked Information Spring 2023 meeting, and I'm here with Mark Laufersweiler, research data specialist at the University of Oklahoma, and Tyler Pearson, Director of Digital Scholarship and Data Services, also at the University of Oklahoma. Thanks for coming, guys.

You're welcome. Great to be here.

You have a lot going on. I'm seeing three different things here that you're working on. Could you talk a little bit about the National Research Platform, the Nautilus project, and the Kubernetes framework? And how did you learn about the NRP?

James Deaton, the former Executive Director of the Great Plains Network, was instrumental in our adopting this. He brought Kubernetes to our attention back in 2018, and we started playing around with it. I think we initially went the more difficult route, and James was instrumental in redirecting us to keep it simple. He mentioned the National Research Platform and their Nautilus environment, which is an NSF-funded Kubernetes environment. They've had over six NSF grants to date, totaling over $27 million.

Could you talk about each of those things and what they are, for people who may not know?

Definitely. I'll start with Kubernetes. It's a framework for orchestrating the life cycle of containerized applications: the startup, the shutdown, spinning up multiple instances of a pod. Instead of having to do all of that manually, you can configure the platform to do it for you. Basically, it's a conductor. It runs the pods on the resources that you've designated to it, and it goes out, finds those open resources, and fires up your container.

You just talked about the Kubernetes framework. What specifically is the Nautilus project?

The Nautilus project is the running Kubernetes instance that is managed by the National Research Platform. Kubernetes is available through a lot of cloud services, like Amazon, Microsoft Azure, and Google.
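[Editor's note: Kubernetes applications are declared in YAML manifests that the cluster then keeps running on your behalf. As a minimal sketch of the "conductor" idea described above (the names and container image are illustrative, not from the interview):]

```yaml
# Hypothetical example: a Deployment asking Kubernetes to keep
# two replicas of a containerized application running.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app            # illustrative name
spec:
  replicas: 2               # Kubernetes schedules and restarts pods to hold this count
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
      - name: demo
        image: nginx:1.25   # any containerized application image
        ports:
        - containerPort: 80
```

[Applying this with `kubectl apply -f deployment.yaml` asks the cluster to find open resources, fire up two copies of the container, and replace them if a node fails.]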
You can also self-host it. One of the benefits of going through the National Research Platform is that they manage all of that. We are users of their Nautilus Kubernetes environment, and that allows us to focus on the application and less on the system administration.

Why did you want to incorporate these tools into the university library's offerings? And how did it happen that this project originated in the library as opposed to other institutional units?

That's probably my end of the house. As the research data specialist, I do a lot of consultations with researchers, faculty, instructors, and graduate students performing research. As part of that, we offer workshops, particularly Software Carpentry and Data Carpentry, and both of those require installation of software. When we first got involved with the Carpentries eight-plus years ago, those installations were not necessarily problematic. But as time went on, with people's laptops not necessarily advancing at the same rate, and with university IT starting to lock down the ability to install software, we found we were losing a lot of our workshop time just getting the work environment set up. Kubernetes and the Nautilus project gave us a framework where the software required for those workshops is already installed and implemented, and it just requires a simple OU authentication ID to log in and have access. Now all of the participants in the workshop, including the instructor, are working from the same platform with the same framework, and we can concentrate right away on the pedagogy.

We were also hearing from researchers that when they've done software installs in their lab, not all of the machines are necessarily kept up to date with the current software.
And as we've been hearing in a lot of the talks here at this conference, reproducibility is becoming a very hot topic. With this common framework, everyone in a lab group is working from the same environment, which means all the libraries are going to be up to date. Any change that occurs in that framework occurs for all the members of the lab, so they can have a written provenance of what they were running at a particular point in the research, up until they publish. All of the people in that group are able to do that.

As for reproducibility with people outside the university, again, it provides that common framework where co-PIs can work on projects together. Access to that same common work environment means they're doing all their code development, their analytics, and their visualizations in the same framework, so that when they publish, the code that was developed in that framework runs in that framework. Everyone knows that. The other nice feature is that the containers we develop are publicly accessible, free of charge. Others can take the particular pod definition we have describing that environment, download it, and run it on their own machine. So what would otherwise be a barrier is no longer a barrier, and we have that traceability.

And then finally, the students in education. There is a real gap, what we call compute inequality, especially in non-STEM fields, where students are coming in with Chromebooks that can't install this software. There's no way they can make use of some of these analytical tools around data, regardless of domain. If I can't install Python, if I can't install OpenRefine, I can't run those tools. This platform means that, again, with a valid ID, they log in and have access. And it also frees up the faculty members.
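[Editor's note: the "written provenance" of a shared environment is typically captured in the container definition itself. A hedged sketch, assuming a Python-based toolset; the base image and version pins below are placeholders, not the team's actual configuration:]

```dockerfile
# Illustrative only: pin a base image and exact library versions so every
# lab member (and any outside reader who downloads the container) runs
# the identical environment.
FROM python:3.11-slim

# The versions recorded here act as a written provenance of the
# environment in use up until publication.
RUN pip install --no-cache-dir \
    numpy==1.26.4 \
    pandas==2.1.4 \
    matplotlib==3.8.2

WORKDIR /work
CMD ["python"]
```

[Building this with `docker build -t lab-env .` produces an image that anyone can run locally with `docker run -it lab-env`, which is the "download it and run it on their own machine" path described above.]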
So the faculty member teaching a course isn't having to deal with a class of, say, 50 students, where they've got 50 installs and no TA. They can jump right into the pedagogy of what they're trying to teach rather than the installation of and troubleshooting with the software. It also provides a framework for equity in grading: if everyone is running on this platform, everyone has access to the same libraries. You can't say, well, my laptop has a newer library, and that's why my code doesn't work on your machine. So it starts to create avenues for making the life of the instructor easier, which means they can concentrate more on the pedagogy. We were getting these questions from our faculty and institution as we did these consultations. Tyler and I started talking about it, and that's when we said, well, hey, James Deaton kept talking about this to us. Maybe we should actually take a look at it.

Why not central IT or other groups?

We were solving our own problems at first, and bringing in other stakeholders at that point didn't seem to make sense. We wanted to do a proof of concept to see if this really had some traction, and we knew it wasn't going to take a lot of our time to test it out. Now that we've been moving on, we can talk about that in another question, but at the time, it didn't seem pertinent. It was one of those things where we had the knowledge, we had the time, and it was solving our problem. So we did it.

I will say that we had some domain knowledge that other groups on campus did not have. In my former informatics world, we heavily utilized Docker containers, and no one else on campus had that experience. So we could translate that prior experience in containerizing applications, which no other group on campus at the time had.

I'm outside the field, but maybe you can speak to this.
In digital scholarship, in applications, in library technology, it sounds like we're really trying to move toward: OK, we're all doing something pretty close, so let's actually combine our forces and have something with common standards and common tools that we can use seamlessly together. Does that resonate with you?

Oh, yeah. Our new associate dean's view is that we provide the technology, but the technology in itself is not the pedagogy of the domain. So the commonality of tools is always present. We saw that when we first moved into helping with the digital scholarship realm, which led to the digital humanities, and to other groups on campus that maybe had never thought about their data in a quantitative sense; they worked with their data qualitatively. And we've got to be careful here: not everything is necessarily considered to be data, but their scholarly output is still a binary file of 1s and 0s. There were tools that would let them work with their output in ways they had never thought of before, but in their domain specialties they had never been introduced to them. It wasn't part of their graduate studies. So a lot of researchers do not necessarily have these tools in their toolbox.

The library tries to serve everyone: no project is too big or too small, and the domain doesn't matter. This commonality of shared tools really becomes the focal point. Where we can help faculty is in identifying the groups that know the technology and getting them talking to the groups that don't, and letting the pedagogy be governed by the individuals. We didn't want the technology to be a barrier to any education or research; it should be seamless across these tools.
What we can then do with consulting is help them in their pedagogical roles with the tools and let them drive the ship, so to speak, rather than have the technology drive them.

Interesting. So back to the National Research Platform: where does the project stand now? And how do you see it progressing from here?

That is a great question. I know that they continue to have meetings; I haven't been able to attend them, and I'm trying to think, when was their last one? February. They received some additional NSF funding recently, and the University of Nebraska-Lincoln has contributed a lot of compute resources to the platform. There are over 50 partnering institutions that have contributed hardware to it, and I don't see it going away anytime soon. We are seeing it more and more utilized when we go and run our instances. We're actually talking to a faculty member on our campus who is going out and buying hardware to attach to this infrastructure. So we've got people on our campus looking to invest in hardware to run on this platform.

So, last question: what advice or suggestions could you offer organizations interested in implementing something similar?

The National Research Platform has a website, nationalresearchplatform.org, with links to their documentation and how to get started. If you just want to test the waters, they have instructions on the site on how to sign up for an account, so you can kick the tires without doing any configuration or purchasing any hardware. Once you want to start experimenting by customizing some environments for your purposes, we share all of our configurations on GitLab, and the rest of the community does too. Those links are also available through the National Research Platform website, so anyone can download them and follow the instructions for spinning up an environment.
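[Editor's note: to give a sense of what "spinning up an environment" on a shared Kubernetes cluster like Nautilus looks like, here is a sketch of a single pod manifest of the kind such clusters expect, with explicit CPU and memory requests and limits. The namespace, names, image, and numbers are illustrative, not taken from OU's GitLab configurations:]

```yaml
# Hypothetical pod manifest for a shared research Kubernetes cluster.
apiVersion: v1
kind: Pod
metadata:
  name: workshop-test             # illustrative name
  namespace: my-namespace         # you work inside the namespace you were granted
spec:
  containers:
  - name: notebook
    image: jupyter/scipy-notebook # a public community-maintained image
    resources:
      requests:                   # shared clusters expect explicit requests/limits
        cpu: "1"
        memory: 2Gi
      limits:
        cpu: "2"
        memory: 4Gi
```

[Submitted with `kubectl -n my-namespace apply -f pod.yaml`, this is the kind of configuration a newcomer can adapt from a community GitLab repository.]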
One of the things we heavily utilize in our environment is an application called JupyterHub. It is very well documented and implemented in a lot of places, and there's a group that has made it easy to deploy into a Kubernetes environment like Nautilus. So we heavily leaned on the community in deploying our instance, and anyone else can follow what we've done; we put our configurations out on GitLab. That lowers the barrier to entry even more. The few things you need to do are talk to your local IT to get a couple of needed items, like some DNS configuration, and if your IT hasn't already set up SSO with CILogon, that will need to be set up. Other than that, running through our configurations, a user can be up and running within a day.

I'll add that once you go into their ecosystem and get authorized, you create what they call a namespace to start working in. They have a chat-based community channel, and one of the things I have found is that it's an environment where no question is too trivial. Generally, you'll get one of the system people, or even someone else from the community who has run into that problem, giving you answers, and the guidance is usually in the form of a link to either documentation or their own configuration. It's been a really welcoming community overall. People want to see this succeed at all levels, and I think that's part of what's going to help with the longevity question and where it goes, because the community is starting to see what these ideas can do for a small amount of shared resources.

One thing we did not talk about is that if you do purchase equipment and you work with your local IT group, it sits outside the firewall of the university. It actually gets managed by the folks at NRP, so the system administration and the upkeep of the software are handled by another group.
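[Editor's note: the community deployment route described here is commonly the Zero to JupyterHub project, which installs JupyterHub into a Kubernetes cluster via a Helm chart driven by a values file. The sketch below shows where the CILogon SSO pieces plug in; the client ID, secret, and hostname are placeholders, and exact keys should be checked against the Zero to JupyterHub documentation:]

```yaml
# Hypothetical values.yaml for the Zero to JupyterHub Helm chart.
# Install sketch (commands for reference):
#   helm repo add jupyterhub https://hub.jupyter.org/helm-chart/
#   helm upgrade --install jhub jupyterhub/jupyterhub --values values.yaml
hub:
  config:
    JupyterHub:
      authenticator_class: cilogon          # SSO via CILogon
    CILogonOAuthenticator:
      client_id: REPLACE_WITH_CILOGON_CLIENT_ID
      client_secret: REPLACE_WITH_CILOGON_CLIENT_SECRET
      # The callback URL is why you need a DNS entry from local IT:
      oauth_callback_url: https://hub.example.edu/hub/oauth_callback
singleuser:
  image:
    name: jupyter/scipy-notebook            # workshop software baked into the image
    tag: "2023-05-01"                       # pin a specific tag for reproducibility
```

[With DNS and CILogon registration in hand, a values file like this is roughly the "up and running within a day" path described above.]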
You're just responsible for making sure there's power and cooling to that system. And the specifications for the equipment have all been thought out, so there's no guessing game about what you need to buy to be able to participate. They've really streamlined the entry points, so you contribute what you can and where you feel it's worthwhile and join the family, so to speak. It's been, like I said, a very welcoming environment overall.

That's exciting. Is there anything about this that we haven't touched on that you'd like to talk about?

I'll mention that people are free to reach out to us, and we can point them to our configurations if they run into any problems. Well, I'll hold on to that thought.

Yeah, yeah. Bite off more than you can chew.

Exactly. I will say that the faculty members have been kind enough to run this in their courses with the understanding of certain caveats. We don't own the hardware. The network could go out at any time. Bad nodes do occur, so there are sometimes issues around connectivity. It's not 24/7, and it's not designed to be, because it's a pilot for ourselves but also a pilot for the research community at large. But the fact is, it has freed them up. They talk about how much easier it is to teach their class with this common framework. We also appreciate the student participation: when we see a class running and we see 25 or 30 logins, we know, hey, meteorology is teaching their introduction to programming course right now. The feedback we've been getting has been positive; they see it as a bonus, and it's been working out well for them. What we're excited about, which we maybe haven't mentioned, is that this cohort we've been testing with, at least in meteorology, is a freshman- and sophomore-level course, and the environments that we create, and the storage that gets attached to them, are persistent.
So they will be able to use this platform as long as the platform exists, and we do not see that going away anytime soon. We'll be tracking them as they get into their upper-level courses and start to do more computing around resources, with this resource available to them, and we're interested to see how that follows through.

One of the goals is also to allow a student or a faculty member in research to kick the tires. If they decide that they really want this on their local systems, then we can go through the burden of what it would take to install. It cuts back on the number of installs where people later decide they don't want the tool and want you to get it off their machine. It's freed that up. When they don't have a barrier to the install and can get right into using the tool, the install problems don't seem as large, because they already know what the tool can do, as opposed to trying to decide its value while they're still trying to install it.

Yeah. And then they decide if they want to go through the work of getting it installed.

It's exciting stuff. Well, Mark, Tyler, thank you so much for your time.

Thank you. Thank you.