All right, you probably need to get moving along here. We're running a little bit behind time, which means we'll cut into the break time. All right. Thank you.

All right, so this next session is focused on the Ecosystem for Research Networking: exploring democratized access to research instruments. I'm Barr von Oehsen, director of the Pittsburgh Supercomputing Center, and I'll be talking a little bit about the history of the ERN for those of you who are not familiar with it. After me will be Forough, who'll talk about Broadening the Reach, which is focused on under-resourced colleges and universities and giving them access, the democratization piece. Then Maureen will talk a bit about the actual work we've been doing to connect up research instruments within the national ecosystem.

All right, so just to give you an idea: in 2017 a group of us met at the National Research Platform meeting in Montana. I'm always amazed when I see that date, because it feels like it was yesterday, but with COVID in between I've completely lost my sense of time. You can see it was just Rutgers (I used to be at Rutgers University before coming to the Pittsburgh Supercomputing Center), and OSHEAN, which is a regional network provider in Rhode Island.
So that's a research and education network provider. And KINBER, which is a research and education network provider in the state of Pennsylvania. So it was just a small group of people listening to what was being discussed at the National Research Platform meeting and trying to decide if there was something we could do up in the Northeast. The National Research Platform was based on something that was going on in California, connecting up universities using what they call data transfer nodes. In the Northeast, we realized that just about all the states, if you put them together, fit inside the state of California. So it's a lot of small states and a lot of colleges and universities. We did a quick check, and I believe there are around 2,000 colleges and universities across the entire Northeast. So there were some challenges in the Northeast that we had to worry about, and we thought that if we could do it there, then it could be replicated across the country.

So anyway, it was just a few of us. We had some ideas and said, okay, how do we do this? In January of 2018, a group of people came to Rutgers University, sat down, and thought about what this would look like if we were going to do it. You can see that the list is growing a little bit longer. We've added the Massachusetts Green High Performance Computing Center, Internet2, and NYSERNet, which is the regional network provider for the state of New York. We decided that we needed something to pull us all together and get us discussing how we might do this. So we came up with the idea of a federated proof of concept focused on research computing. Several sites said, yeah, this is kind of cool, let's do it. So we repurposed some older equipment.
We connected it up over the networks, and this is where the regional network providers came in handy. We looked at the ways we could move data around and give people access, and you can see that the list grew even longer. We found that if we put a stake in the ground and said, hey, let's do this, all of a sudden other people would say, yeah, that's kind of cool, let me try it too. So Syracuse came in, and Edge, Google, and Maine jumped in and helped think about what this might look like. As it says, it was an early, if not elegant, approach, but the idea is that it got us talking and thinking about what we could do as a community.

Again, because the network is really important if you're doing federated services, there's a tool called perfSONAR. It allows you to understand the connections between all the different sites. We set that up so that we could see where some of the pain points would be if we were going to let people access different data sets distributed around the region. As we did more and more of this, the interest started growing and got us thinking: we're actually being successful at this, so maybe we should formalize it somehow. Originally the ERN was the Eastern Regional Network, where "network" meant a network of people and a network of universities. So it was really a consortium of people and universities. As you can see, it was all of the above plus Delaware, New Jersey Institute of Technology, Buffalo, and Bucknell. The list kept getting longer and longer as we did this. So anyway, we got really excited about what this might look like as we started building it out, and we ended up formalizing it.
We came up with a vision and a mission. If you look closely at the blue areas (you don't need to read the whole thing), it's really about supporting and enabling collaborative data- and computation-enabled science; building out standards, blueprints, policies, and training; and the last bit is about democratizing access to research instruments. So we pivoted from looking at research computing to looking at research instruments in general. That was a mind shift, and it was based on the fact that we got an NSF Campus Cyberinfrastructure planning grant. It was actually announced that we got it while we were having one of our all-hands meetings. This money allowed us to start building out working groups and really focusing on different types of research instruments.

If you look at the list of working groups, we were thinking about materials discovery, structural biology, and then what the architecture and federation, the computer science component, would be. On policies, we wanted to make sure we had the policies in place before we started sharing services. And then there's Broadening the Reach, which Forough will talk a bit about; Maureen will be talking about architecture and federation. But the idea is, how do we start supporting research across the country and giving people access to these research instruments?

So, the name change. What happened was that we had more universities across the entire U.S.
start to reach out to us and get involved, so we changed from "Eastern" to "Ecosystem". We wanted to keep the ERN acronym, so we came up with Ecosystem for Research Networking. Everyone just calls us ERN now, so I think most people don't even remember what it stands for, but that is what it stands for.

You might wonder why we focused on materials discovery and structural biology. The reason is that, after having conversations with lots of researchers, we realized that if we could solve the problems of connecting up research instruments for materials discovery and structural biology, we would be able to make much broader connections to other instruments as well. We've tested it out with different types of instruments, and we found that we could easily swap things in and out. So that was really the only reason. We did submit another grant proposal, and one of the complaints was that it had too many science drivers, which you'd never expect to hear from NSF. They said to narrow it down, so we narrowed it down to materials discovery and structural biology, and everything else was broader impact.

Anyway, here are some of the activities we've been involved in. Of course there are the working groups. We've been running workshops. We've been offering recommendations on data standards, architectural blueprints, and policies (you'll see some of that later in this talk), data management, and other things. Again, we're interested in getting people involved with us, so we set a pretty low bar for membership: we just ask that you like what we say in our vision and mission statement. I'm not going to ask you to read through all of this, but it's a lot of different universities: small, large, R1s, R2s. Small liberal arts colleges have been involved.
We've had different organizations, industry partners, research and education networks, and funding agencies working with us and thinking about what this might look like. The next slide jumps over to Broadening the Reach, so I'm going to hand that over to Forough.

Thank you, Barr. I'm Forough Ghahramani, assistant vice president for research and innovation at NJEdge. NJEdge is a regional research and education network, and the work that I do at Edge, as we refer to it, is very synergistic with my role on ERN Broadening the Reach as co-lead with John Hicks. It's a pleasure to be able to share some of what we've been working on, and we look forward to hearing from you on how the ERN can best serve the needs of your communities.

The mission of Broadening the Reach is to understand the needs of diverse institutions, the emerging non-R1s and small and medium-sized institutions, including the MSIs, so that the ERN can have the broadest impact across multiple research disciplines. We identified very early on that workforce development is an important aspect of the work we do with the ERN: preparing the next generation through integration of research and education, and developing computational expertise and CI professionals at all levels. Another very important aspect of the ERN and of Broadening the Reach has been building the community and developing strategies to reduce barriers to, and improve opportunities for, resource sharing and collaboration.

The working group, like the other working groups, meets monthly. We have surveyed the community to understand which topic areas we should be focusing on, and we've listed those topic areas. At the very top is access: access to funding, access to expertise, access to resources for education, and access to the communities that are important. As Barr mentioned, we've had a series of workshops and educational seminars to understand the needs
but also to raise awareness of the resources and funding opportunities that are available, and to foster collaboration among the community. We've shared the findings in publications and at conferences such as this one. We've made recommendations to institutions based on current practices, as well as recommendations to the ERN on how it can serve the community, and some suggestions to the funding agencies. Most importantly, many significant partnerships have been formed, as Barr indicated, and a community of thought leaders focused on collaboration and resource sharing has come together.

So, some of the findings. It's a work in progress, but there are many factors involved in reducing the barriers. As I mentioned earlier, funding is an important aspect. There are limited funds for expertise and resources. Multiple sources of funding are required, both from the institution and from funding agencies, and models that remain sustainable after federal funding sources are exhausted are necessary.

Then there's infrastructure: compute, storage, instruments, and access to the cloud. Existing infrastructure resources may be inappropriate or insufficient for the research being performed. Leveraging the cloud is both a challenge and an opportunity. While it's known that there are benefits to leveraging the cloud, there are also challenges associated with both adopting and navigating it, and training is another big challenge. The cloud relationship is a business relationship.
So vendor management, cost management, and training are at the top of the list of challenges. Many of the challenges for the cloud are similar: specifically, cybersecurity management, procurement, and integration complexity, especially when multiple clouds are involved.

I mentioned expertise. It's really hard to find the expertise to support the needs of the research community and to support the resources that are required. In many cases, a cultural transformation on campus is required. Teaching is a top priority at many of these institutions, and therefore there are limited resources and time for research. Building relationships across campus and sharing information actively matter, and very specifically, bridging the communication gap between enterprise IT and the research community is an important aspect. So is gaining leadership support: having leadership understand the value of having resources available, and the fact that access to these resources serves as a recruitment tool for faculty as well as for students.

There's a need for standards, guidelines, standard practices, and policies, and for striking a balance in policy between IT and research, especially in areas such as security policies, identity and access management, authorization, and using the cloud.

Communities: access to these communities is just as important as access to the technological resources. We will continue to work to understand the needs of these communities through workshops, seminars, and outreach activities. We have an upcoming summit on April 11th and 12th.
It will be in Pittsburgh, most likely at Carnegie Mellon University. We have partnered with the Consortium of Liberal Arts Colleges for that summit, and we welcome participation. Last but not least, we want to identify some funding resources to support student internships. I'm going to invite Maureen Dougherty, who will review the cryo-EM project update.

Thank you, Forough. My name is Maureen Dougherty. I'm the program coordinator for ERN, and I'm here to talk about our cryo-EM instrument pilot project. This doesn't like me. It's not going the right way. There we go. Okay, technology.

What we were trying to do with this project is address some of the concerns that were brought out in the outreach activities Forough just spoke about. We're trying to provide remote access to instrumentation at the edge of campuses. Typically these instruments are secured behind firewalls, with a lot of different infrastructure restrictions and security policies, things that don't allow easy access, particularly for the non-R1s and the underrepresented, under-resourced institutions.

So what we did was review these barriers and challenges. Infrastructure, whether it's there or not. Security: we want to make sure it's not so secure that it's difficult to use and causes people to do things we don't want them to do. Policies that prevent people from having access, whether they're local institution policies or lab policies. The complications of authentication, authorization, and access throughout the workflow. Ease of use, as I mentioned. Accounting: these instruments could be used as a fee-for-service, so the accounting aspect might be difficult for the non-R1s to address. And then knowledge, expertise, and education: just having the knowledge to create these pathways in a secure manner, and having people who can use the instrument in the proper, optimal way, can be very challenging for some of these groups.

So, our objectives: we want to make this an easy, secure, web-based
portal with very simplified federated authentication, authorization, and access. We want to leverage the existing policies and procedures of the institution; we want to augment, not replace. It needs to be easy to use. We want to be able to create real-time workflows with real-time parameter adjustments, leveraging edge computing. We want to be able to leverage outside resources for advanced analytics. And we want to ensure that we have a secure data management system throughout the pathway, particularly for the large amounts of streaming data that we're getting. We also don't want to reinvent the wheel. We want to leverage what's out there. We are a small group, so leveraging what someone else is doing is going to help us progress further. And then we're sharing our resources and our efforts with the community. We have provided a GitHub link, and it's something you can check on our website as well.

So what we came up with is this particular project design. On the left side is the actual resource. This is the edge instrument for cryo-EM: the transmission electron microscope. What we've introduced is what we call the ERN Open CI cloudlet, and it has two components. One is the edge compute resource, which is a GPU and some storage; this allows us to do image pre-processing so that we have reduced latency and can do the real-time parameter adjustment we talked about. The other is the instrument portal, which is the web-based access for the remote user to come in and use these various resources easily, so it's not complicated and anyone can do it.

So this was the original design, and we started with phase one: can we do this manually? If we can't do it manually, we can't do it at all. We partnered with the Rutgers Cryo-EM and Nanoimaging Facility and Dr. Jason Kaelber. They have a transmission electron microscope from Thermo Fisher Scientific and a K2 camera. What you see here is the exact configuration.
The edge resource has two systems that sit in front of the microscope, and those are proprietary systems. The camera then feeds into a camera server, which is also proprietary. We then mount that camera file system, read-only, into our cloudlet. So this is the setup. The cloudlet then does the image pre-processing, leveraging a software application called cryoSPARC. This is a standard cryo-EM application; another one is called RELION. We use cryoSPARC for our workflow management, data management, and monitoring.

Our science researcher logs in to the Rutgers VPN; these are all Rutgers individuals. We have configured the pathway using private VLANs and access lists that restrict IP addresses and service ports, which makes it completely private and secure. Our researcher is then able to log in to the cloudlet and launch his cryoSPARC application for the workflow management aspect. He is then able to access the edge access bridge system, which takes him to the microscope, where he modifies his parameters. Then he's able to go back into cryoSPARC and run his process. So this could be done manually, and everything worked. He ran a full protein structure analysis: two days, 2.5 terabytes of data, and everything was fine. So we were very excited.

We can do it manually, so let's automate. For the automation, we started with Open OnDemand as our instrument portal. We chose it because it's a well-received, open-source, community-supported resource. We worked with both Open OnDemand and Thermo Fisher Scientific, the vendor, to figure out the best way of accessing the microscope. We did try various things.
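The access lists mentioned in the manual phase, which restrict IP addresses and service ports, can be pictured with a small sketch. This is only an illustration of the idea, not the project's actual network configuration: the `Rule` type, the `ALLOW` list, and the addresses (taken from the reserved documentation ranges) are all assumptions made for the example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One allow-list entry: a source network and a destination service port."""
    network: ipaddress.IPv4Network
    port: int


# Hypothetical allow-list. The addresses come from documentation-only ranges
# (RFC 5737) and do not describe any real Rutgers or ERN network.
ALLOW = [
    Rule(ipaddress.ip_network("192.0.2.0/28"), 443),      # e.g. portal over HTTPS
    Rule(ipaddress.ip_network("198.51.100.10/32"), 22),   # e.g. one admin host over SSH
]


def is_allowed(src_ip: str, dst_port: int, rules=ALLOW) -> bool:
    """Return True only if some rule matches both the source address and the port."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in rule.network and dst_port == rule.port for rule in rules)


if __name__ == "__main__":
    print(is_allowed("192.0.2.5", 443))   # inside the /28, allowed port
    print(is_allowed("192.0.2.5", 22))    # right network, port not listed
```

Anything that does not match a rule is denied by default, which is the property that makes a pathway like the one described "private": only named sources can reach named services.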
We want to keep in mind that while this is specifically for cryo-EM, we want to make it available to a broader community. We want to make sure that our solution is not tied just to this particular scientific instrument for cryo-EM, but works for other types of microscopes and, beyond that, other science domains.

So the workflow was developed. We leveraged an outside expert, because our group didn't have that expertise. What we talked about as a challenge for our researchers was also a challenge for us. That expert was able to come in, leverage his own private testbed, and build a containerized Open OnDemand using Podman and Buildah, and we were able to run that on our cloudlet. I'm really glossing over a lot of really crazy stuff that we encountered. In the testbed, everything worked fine and lovely. He created a parameterized configuration file to drive his scripts for when we moved off the testbed, which was running a particular operating system. He was using Globus and a flat file for his authentication; our production area was using CILogon and a local instance of LDAP. Everything worked fine in the testbed. In production, not so much. We had a lot of things to work out, and eventually we did. We were able to launch his process again, and he was able to run a workflow. So it did take a lot of work, coordination, and collaboration with the vendor as well as with the other institutions and Open OnDemand.
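The parameterized configuration described above, where the same scripts run against Globus plus a flat file in the testbed and CILogon plus LDAP in production, might look roughly like the sketch below. The `PortalConfig` fields, the profile names, and `load_profile` are hypothetical names chosen for illustration; the talk does not describe the real file's format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortalConfig:
    """Settings that differ between environments (hypothetical field names)."""
    environment: str
    login_provider: str   # who authenticates the user
    user_directory: str   # where accounts are looked up


# The two environments mentioned in the talk, expressed as named profiles.
PROFILES = {
    "testbed": PortalConfig("testbed", "globus", "flat_file"),
    "production": PortalConfig("production", "cilogon", "ldap"),
}


def load_profile(name: str) -> PortalConfig:
    """Look up a profile by name, failing loudly on an unknown environment."""
    try:
        return PROFILES[name]
    except KeyError:
        raise ValueError(
            f"unknown environment {name!r}; expected one of {sorted(PROFILES)}"
        ) from None


if __name__ == "__main__":
    cfg = load_profile("production")
    print(cfg.login_provider, cfg.user_directory)
```

Keeping every environment-specific value in one place is what lets the deployment scripts stay identical across sites, although, as the speaker notes, differences that are not captured in the parameters can still surface in production.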
So, phase two. Now that we've got the basics there, we wanted to leverage a few other things that we have as part of Rutgers, and one of those was a FABRIC node. Rutgers was in the process of standing up that FABRIC node, and we were going to create a slice, create our own FABRIC cloudlet, and replicate what we did in the ERN instrument cloudlet. Unfortunately, the Rutgers FABRIC node was not verified by the time we started this project. So instead of using the Rutgers FABRIC node, we used one of FABRIC's network center nodes, in Washington, DC. We created a slice using a FABRIC facility port and, leveraging our existing infrastructure at Rutgers, created an environment where we could access the instrument portal on the FABRIC node in Washington, DC, which mounted and was able to process the microscope and camera data streaming from Rutgers in New Jersey. We ran this locally in Washington, DC, but we were monitoring and accessing the microscope from Minnesota during FABRIC's KNIT 7 members meeting. So from Minnesota to Washington to New Jersey, it all worked, and we were very excited that we were able to do that. We're now in the process of building that on the Rutgers FABRIC node, now that it has been verified.

The other thing we're doing: at Rutgers we were using the Amarel cluster for additional analysis. The workflow goes from the camera to the edge resource for image pre-processing, and the result gets written to a file system on Amarel, and the analysis is done there.
So instead of using Amarel, which is Rutgers' cluster, we approached Pittsburgh Supercomputing Center, and we were able to work with them to build a workflow that leveraged Bridges-2, their cluster. So it went like this: from the Rutgers instrument cloudlet we accessed the microscope; data was transmitted from the microscope to the edge computing at Rutgers, the ERN cloudlet; image pre-processing was done there; and then the data for additional analysis was written out to Bridges-2, where the analysis was done locally, and our output came out. We haven't had a chance to do any analysis of latency. We did not do a full test like we did with the manual process, where he ran a full simulation. So that still has to be done to ensure that we have that real-time parameter adjustment. But it was working.

Our next project is to incorporate a true data management and workflow management system like Pegasus. Right now we're leveraging cryoSPARC to do that work, and that's very limiting, because it means we have to use that particular software application. What we really want is to make sure we're not limiting ourselves and that we're providing an option that allows as robust a system as possible. That's why we're looking at Pegasus. So this is the result of our efforts after a year and a half of phase one.
We start by going to the splash page. This site is going to be a portal for all of the electron microscopy resources. I'm going to log in using my school's federated identity provider. This is the CILogon that we talked about. I'm going to get a Duo 2FA push here, and now I'm at the splash page of the portal. I can access instruments or do analysis through these tools, and I'll access the instrument through this button here. So now I'm connected to the electron microscope, and what I'd like to do with it is acquire some data. I've already started an analysis session, and we're going to connect to that now. I've taken a few pictures already, and I'll just show you an element of the configuration. First, we've divided up our compute so that pre-processing is done locally, but the cluster is used for the main processing after data reduction. There is a watch folder on the instrument, which we access through the cross-mounted file system to get these images.

Okay, now the images that I just started to collect are coming off, so we can take a look at some of them and see how they are. And how they are is pretty bad. What we can see here is that these lines indicate the water has formed ice crystals, and we don't want that. So we're going to go back to the microscope, remotely control it, leave this point, and collect somewhere else on the grid. Image 75 we've just collected now.
We're going somewhere new, and image 76 is going to be at the next site. What we should see is that, by taking the interactive feedback from the live processing of the data, we are able to redirect the microscope to a more appropriate target and make better use of our instrument time. Now that the frames have been saved, the image appears almost immediately in the analysis window. It's in progress as the GPU acts on it, and the A1000 GPU has finished the frame alignment. Now we see that we have much better image quality, which is giving us the type of data that we want. This demo is complete, so I'll log out.

Now, it took us about a year and a half to get this far. If you recall, one of the icons actually showed two microscopes, and that was because we were doing a demo at SC22 and couldn't get the microscope reserved for the demo. As I said, it was a year and a half for that. It took us one week from saying "maybe we can use the ion burner" to having it available, and that's because we consciously decided to make something robust, flexible, and dynamic, so that we could add other instruments that easily.

So, in conclusion, this is working. We are able to make those real-time adjustments. We are leveraging the edge computing. We have reduced some of the I/O. Dr. Kaelber and his lab are currently the only ones who can access this. We still have work to do, but they are using this in production. Lessons learned: security is still an issue, and expertise is key. We had to reach out because we didn't have the expertise, which means there are other groups out there that will need it as well. So providing this will, again, make it easier for others.
We don't own this; we share it and provide it to our researchers. We have a number of groups that are interested. The group on the left are the ones that are participating. We have interested parties in the middle, and then we have the ACCESS sites that we're enabling, similar to PSC's Bridges-2. These are the groups of people working with us: all of the ERN groups, Open OnDemand, FABRIC, Pittsburgh Supercomputing Center, and the Pegasus team. If you want more information, please visit our website. I think we're at ern.ci now. We are over time, but if no one kicks us out, maybe we can have some questions. Yes, sir?

So, the way I see it, this is where the training is going to be really key. Accessing it remotely will be important, because a lot of times they can't travel to the site. So this is where partnering with the university comes in, or, if they have their own samples, having them shipped ahead of time and having somebody in the lab load the samples beforehand. We are tying this in directly with all of the core services that universities have, for example iLab, so you can reserve time on the instrument and all of that. But before you can get there, you have to go through some of the training. They could have somebody from that site work very closely with that class and educate them on it. So that is an important component of it.

So that was really great and fascinating. As someone who does a lot of streaming data infrastructure development, I'm really interested. Maureen, you mentioned that the data is streaming, but is it what someone who really works on streams would think of, like uber streams and Apache Kafka, or is it more like Globus connectivity
from site to site, moving data in an automated fashion?

So we're leveraging cryoSPARC to do the main data transfers, and then we have the file systems that are cross-mounted. In the local case at Rutgers, the file systems are all mounted into the cloudlet, and cryoSPARC is reading from one monitored file system, writing to another, and then launching those jobs on Amarel. And because we are mounting the Bridges-2 file system into the cloudlet, it does the same there. Eventually we will have to find another solution, because this is tied to cryo-EM software, cryoSPARC and RELION, and that's why we're looking at Pegasus or other data management or workflow management systems.

So I'll talk to you afterwards about some streaming systems we've built that I think might have a role, like a partial role, here.

Yeah, that would be great. It would be interesting.

And if I can also respond, as someone who's done a lot of TEM work: I actually think it's great that you've worked on something that's really hard to do and has a lot of other pieces, because then other things will be easier. But I also think about actual applications, things like REU students. We have REU students who come for the summer and then go back to their institutions, usually not R1 institutions because of the way we select them, and they won't have access to come back to the lab. But they have a project that can continue, and they have collaborators there, and now they can continue to use an instrument. Obviously the latency still causes a lot of difficulty, and they're not going to run super easily and get the best data, but they can do what they came to do for their projects and continue that. So I think it's pretty cool.

Yeah, the challenge is the connection from the smaller college or university to the national resources. We worked with the Bridges-2 team and FABRIC because FABRIC is a national network testbed, and it's built on a terabit backbone.
So it's very, very fast. Every national center has a FABRIC rack there, so we can leverage that. But getting from the campus to that FABRIC infrastructure is always the challenge, right?

Yeah. Sure. Thank you. Anyone else? I think everybody went for a break. Okay, great. Thank you so much.
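The data movement described in the Q&A, reading from one monitored file system and writing to another, is a watch-folder pattern rather than a message stream. A minimal sketch of one polling pass is shown below. It is not how cryoSPARC implements this internally; `process_new_files`, the `.tiff` suffix, and the plain file copy standing in for pre-processing are assumptions made for the example.

```python
import shutil
from pathlib import Path


def process_new_files(watch_dir: Path, out_dir: Path, seen: set[str],
                      suffix: str = ".tiff") -> list[Path]:
    """One polling pass: hand off files in watch_dir that have not been seen yet.

    watch_dir is only read from, mirroring the read-only camera mount in the talk.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(watch_dir.glob(f"*{suffix}")):
        if src.name in seen:
            continue  # already handled on an earlier pass
        dst = out_dir / src.name
        shutil.copy2(src, dst)  # stand-in for real image pre-processing
        seen.add(src.name)
        written.append(dst)
    return written


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        watch, out = Path(tmp) / "watch", Path(tmp) / "out"
        watch.mkdir()
        (watch / "img_075.tiff").write_bytes(b"frame")
        seen: set[str] = set()
        print([p.name for p in process_new_files(watch, out, seen)])
        print(process_new_files(watch, out, seen))  # nothing new on the second pass
```

A real deployment would call a pass like this on a timer or react to file-system events, and would need to guard against picking up files that are still being written.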