I think it's time to get started. Welcome, everybody. I'm Cliff Lynch, director of the Coalition for Networked Information, and I welcome you to this breakout session, which is part of CNI's Spring 2020 virtual meeting. Now in the beginning of its second month, it will run until the end of May. We'll be hearing from our presenter today, and we will take questions at the end. Diane Goldenberg-Hart will materialize and moderate the Q&A. There is a Q&A button down at the bottom of your screen, and you can use that to queue up questions at any point during the presentation. I urge you to ask questions as they occur to you; that will save us time at the end when we go to answer them. The topic today is data science support and how to build capacity through graduate fellowship programs. This is a very timely issue. Certainly, in our recent work with ARL and EDUCAUSE on emerging technologies that will have high impact on research libraries in the next few years, data science has been identified as one of the top priorities. I'm a little uneasy calling it a technology, because it's really a group of related technologies and skills, but how to build sufficient capacity for it is a major challenge at all of our institutions. I'm delighted to have Jeffrey Oliver from the University of Arizona, who will be presenting on behalf of the team you can see listed on this slide, and who will talk about their experiences and strategies in this area. Thank you so much for doing this, Jeff, and without further ado, over to you.

Thanks, Cliff. Thanks for that introduction, and thanks to all of you who are here today. The first thing I want to do is recognize the collaborators that I'm working with on this program. This has largely been a collaboration between myself and Dr. Vignesh Subbian in the College of Engineering to create a program for graduate students to support data science at the University of Arizona.
It's further supported by the University of Arizona Data Science Institute, first with Susan Miller, who retired on Friday, and now by Maliaca Oxnam. Without these folks this work just would not have been possible. When I talk about data science, usually the first thing we have to do is define the term. If you look on Google, these are the top suggested questions that folks are asking, and I think they're pretty telling, because they're really asking what this field is and how to explain it in terms that a normal person can actually understand. So that's what I want to start with: what do I mean when I say data science, and can I say it without a lot of jargon? I look at data science, and this is one definition, not the only definition, as the overlap of three fields: computer science; advanced mathematics and statistics; and some area of interest, some domain such as biology or astronomy or linguistics, where those technologies are applied to address questions. Data science also includes really critical skills around communication and ethics that are sometimes overlooked, but for the purposes of this talk data science is going to be the overlap of these three areas.
Now, as Cliff mentioned in the introduction, there's a consistent, growing interest in data science. Going back to Google, if you look at the searches for the phrase "what is data science" for the past 10 years, you can see a steady climb upward in that search. I want to take 10 seconds and have you look at that graph and notice that every year there's a spike. So what is the spike? Although I can't say for sure, one of the things that it does correspond with is the start of the academic school year, so I suspect that when folks are looking for majors when they come to college, they type in "what is data science." So that's kind of a fun look at the data. When I consider this definition of data science as being the overlap of computer science, mathematics, and subject matter, very frequently that third element is not discussed much. But the subject matter expertise is super important to the field of data science and to being able to make meaningful inferences using those approaches from mathematics and statistics in some domain. Very frequently we talk about machine learning or neural networks or some part of data science where we sort of forget the subject matter expertise. The challenge, of course, is that there are a lot of domains across the college campus. These are just some examples of the way data science is being used across domains. We have it being used in the field of English, looking at who actually wrote texts such as some of Shakespeare's works. We also see it playing a critical role in public health, in being able to provide on-demand or in-the-moment help. And finally, we're seeing a lot of data science applications in the field of biology, looking at how organisms are responding to the changes in seasons using data that are just too large to be analyzed through traditional approaches. I want to give a shout-out at
this point, too, to subject matter expertise. Earlier today the team at UNC gave a really cool talk about using machine learning to identify Jim Crow laws, a prime example of how libraries are engaging in the data science realm. If you didn't get a chance to see their talk live earlier today, I would highly recommend going back to watch the recording, because they did a really good job of showing how libraries are going to make a meaningful contribution in data science. So if we think about data science as being a very diverse realm, we can't really hire a subject matter expert in every domain in the libraries. It would be nice if those resources were available, but they're not. So how can we still support data science across campuses? If we think about supporting data science, what do we need to actually do that? One thing we need is experience: we need folks who have experience in applying modern computational and statistical approaches to address these domain-level questions, questions in astronomy, questions in biology, questions in linguistics. We also need folks who have an interest in data science education, in helping others level up and gain access to the knowledge, skills, and abilities so that they can use data science approaches. And I should say that while pandas is an important Python package for data science, this panda is actually here to remind us that we would normally be in San Diego. This is from two years ago, from the CNI meeting in 2018; it's a panda from the San Diego Zoo. So, a little bit of San Diego coming to you all, even though we couldn't be in San Diego. As for these needs of expertise and interest in data science education, there is considerable capacity for both of those in the graduate student population. There are graduate students who actually have the in-the-trenches experience of learning how to do these new computational approaches and applying
it to questions in their own domain, and there's also an interest in actually helping others acquire similar approaches and similar skills, to be able to do these types of data science analyses in their own field. So what we did is we created a program called the Data Science Ambassadors fellowships. This capitalizes on the knowledge that is present in the graduate student population, and it provides infrastructure so that they can support researchers in their data science needs. So what does the program actually do? The first thing we do is provide training to all those who are part of the Data Science Ambassadors program, and we start with the Carpentries instructor training. It happens online for the most part, and it's a two-day program where everyone in the training receives instruction on good pedagogy, especially for teaching computational skills; on the importance of motivation in creating a positive learning environment; and on recognizing some important parts of the learning environment, such as impostor syndrome, stereotype threat, and the idea of a growth versus a fixed mindset. This is a really great training program, and I would say that even if you never become a Carpentries instructor it's super useful. All of the materials are available online for anyone to use, so even if you never teach a Carpentries workshop but are doing instruction in the realm of computational literacy, I highly recommend this training. So this training is the first thing that we ask our ambassadors to go through. The next thing we do is work with them to develop a roadmap of expectations. Each of the ambassadors creates what we call an engagement plan, where they lay out, in each of the two semesters in which they'll be an
ambassador, plans for how exactly they are going to serve the researchers in their particular domain. It's going to be tailored to the constituents that they serve, so really the scholars that are in their respective colleges, as well as to their own expertise: what they feel comfortable providing support for and what areas they may look to collaborators to help out with. These engagement plans are an iterative process, where the ambassadors draft an engagement plan and the directors, myself and Vignesh, provide feedback, and we usually go through a couple of rounds until we get to a final draft. One of the challenges with data science resources, and you probably see this on your own respective campuses as well, is that there are a lot of data science resources across campus and there's really no one place that has all of them. So what we've done is we started with a relatively slim document, a data science resource guide, and then we asked the ambassadors to curate this document so that it has all of the resources across campus that they can either use themselves or direct people in their respective colleges to for support. This resource guide includes various service units as well as people who are working in specific data science areas, and it includes things like where they can get support for high-performance computing, statistical consulting, and cloud computing resources. The ambassadors have actually done a considerable job of contributing to this guide, so it removes the burden of everybody going and searching the university's website to find resource X; we now have a single guide that they can use to navigate. Now, the ambassadors aren't doing all of this for free. There is a stipend that we provide. There's a lot of professional development that we're providing to the ambassadors, but we also recognize that labor is important and that graduate students should be compensated for
their labor. So we provide a small stipend. It's not meant to replace a graduate assistantship or a TA-ship; really, it's just to support a little bit of the effort that they're putting in. This is one thing that we're mindful of: we're really only asking for about 15 to 30 hours of their time each semester. We are really careful and mindful that grad students have a lot of work to do and that they are compensated for it, and so, with the support of the University of Arizona Data Science Institute, they receive a small stipend for their efforts. But as I mentioned before, we don't anticipate that this should take up all of their time; it really should be taking up a minority of their time. So how do we do this? One of the things that we really want to avoid is exploitation of graduate students. There was one example where somebody asked one of the Data Science Ambassadors for help, and really, in the end, they were asking the ambassador to do their homework for them. I provided a little bit of guidance, but actually the rest of the Data Science Ambassador cohort did a really good job of working out how to handle this, and of making clear that it's okay to say no. We use Slack for communications among the ambassadors, and the community did a really good job of supporting the ambassador in guarding their own time and providing an appropriate level of support. Another thing we've faced, not from the ambassadors and not from students, and this is very much a minority reaction, is that some of the faculty, some professors, are not really excited about recommending this to their students. The question is, you know, "Why would I tell my student to do this?" And so we reiterate the benefits, not only to the students, for whom there are a lot of networking opportunities and great skills development, but there's also benefits to
the research group, so they can bring back best practices and provide training to the research group. You know, for some people this is competition with data collection or analysis that their graduate students, quote unquote, should be doing, and there's only so much that we can do. Some folks are not interested in that, and that's okay. We still have enough interest from the colleges without worrying too much about that. One of the changes that we've made recently is that in the first year we had a self-nomination process, and in the second year we switched to having colleges nominate the candidates. One of the things that this does is provide buy-in from the colleges. It gives them agency, because a lot of the colleges in the first year weren't quite sure what the ambassadors were supposed to do or what they were for. By going through the colleges and asking them to nominate the students, that gives them more agency, and they have a little bit better idea of what role the ambassadors could serve in the college. Because we are asking the nominating units, the colleges in this case, for a stipend, it makes sense that they would have the choice of whom to nominate, and we also provide them the opportunity to nominate more than one ambassador if they have the support needs to do so. If we just look at the first two years, there's been growth in interest in the program. From year one to year two we doubled the number of colleges that there are ambassadors in, and the interest, like I said, just seems to grow. To take a quick glance, and this is not an exhaustive list of the sorts of things that the ambassadors have been doing: they're providing a lot of consultations to scholars in their respective domains, and they've been either instructors or helpers at four Software
Carpentry workshops here at the University of Arizona. They've also been standing up their own standalone workshops in things like common scripting languages such as R and Python; Orange, which is a really nice graphical user interface for data science applications, especially machine learning; as well as work in geographic information systems. It's really cool. One of the things that doesn't show up on this slide but that I want to mention is that in this latest cohort we have one ambassador to the College of Humanities and one ambassador to the College of Education. These are realms that you may not immediately think of for data science, but it's really cool to see this ambassadorship program reaching in and engaging with domains across the campus that are interested in data science in different areas or at different levels of maturity in data science. That's one of the things that's great about the ambassadors program: they're able to assess where folks need help, at what point in the data science maturity cycle they need help. These are some of the specific things that they've done, really surveying the landscape of needs for data science. As I said, we have an ambassador to the College of Humanities who is really introducing data science; I think the topic of the workshop that they ran was something like "What is humanities data, and what do we do with it?" That was really good. Another standalone workshop was on introducing machine learning with a great Python tool called PyTorch. They've also gone beyond just doing individual one-off efforts. The first cohort of Data Science Ambassadors was instrumental in launching the University of Arizona's inaugural Women in Data Science event in 2019. There were over a hundred folks who participated, and it was a great event. We've done it again this year, but it took a Data Science Ambassador to really
provide the impetus to get that started. The Data Science Ambassadors in the second cohort are currently leading the organizational efforts for our 2020 Research Bazaar, and they've also been involved in a variety of other efforts, including R-Ladies: one of our Data Science Ambassadors is actually the local organizer for R-Ladies workshops, and we also have another ambassador who's working quite a bit with the Code for Venezuela project. So that's the first two years. For the third cohort, the deadline for applications was on Friday. It's 2020, we're in the middle of a pandemic, and in the face of the economic crisis that's coming there's concern about providing stipends for graduate students and about what sorts of resources are going to be available. So I will be honest, I was a bit apprehensive about the response that we might get this year and how it might diminish because of the current situation. But as it turns out, it was probably not something I needed to worry about. I think it is quite telling that, despite the challenges that we're facing right now, the interest in the program really remains strong: we had another 13 nominations from five colleges. And it's cool, because some of these colleges, the College of Science and the College of Ag and Life Sciences, are clamoring for more Data Science Ambassadors. They've actually nominated multiple graduate students to play a role in the Data Science Ambassador program. So with that, I just want to close once more by thanking the folks I've collaborated with. The slides that I have presented here are all available on Google Drive at the link, and the video is going to be posted later on, I'm sure. I want to again thank especially Vignesh Subbian, who's the co-director of the Data Science Ambassador program. It was really through working with him that we launched this. At this point I'll stop, and I will address the
questions that you have, and I'm pretty sure that's going to be moderated by Diane.

Yes, it will. Thank you, Jeff. Thanks for your great talk; that was really interesting and fascinating to see. I was particularly interested to hear that you have an ambassador in the College of Humanities, and I was curious to know if that was a targeted recruit, if it occurred spontaneously on its own, or if it is something you are doing to balance your portfolio. How did that happen?

Yeah, so it wasn't necessarily targeted. We put the call out to all college deans to gauge interest. But there is an intentional diversification on the part of the directors to engage with lots of colleges. We did not design a program to support a bunch of graduate students in the computer science department; that's not the purpose of the Data Science Ambassadors program. So really it's a matter of trying to engage with as many colleges as possible across the campus. So yeah, good question.

Interesting. All right, thank you for answering that, and sorry that I jumped in line, everybody, but I just couldn't resist asking. With that, I just want to introduce myself. I'm Diane Goldenberg-Hart with CNI. Welcome to Jeff Oliver, and thank you for giving that great talk, and welcome to all of our attendees. Thank you so much for joining us as part of CNI's Spring 2020 virtual meeting. We're really grateful to you for making time out of your day to join us. With that, I will invite our attendees to please go ahead and type any questions you might have into the Q&A, and I will read them aloud so Jeff can answer them live. Or, if you prefer, type them into chat and I will read those too: questions, comments, anything you'd like to say. And I see that we actually do have a question already, which comes to us from Amanda Henley, who was one of the presenters in our last webinar
this afternoon. Amanda writes: "This looks like an amazing program. I wonder about the hours. It seems like the need for consultations would easily fill 15 hours. Have you had trouble with students running out of time?"

We have not. And again, part of this is that we try to remind students to be mindful of their time and to remember that their primary goal should be to get a degree while they're here. But there's only so much I can do to tell students not to spend too much time on this stuff, and I think they probably all go over. To some degree, the way I look at it is that we really make sure that they are not spending too much time on it and tell them to back off if they are. But I think when they go beyond the 15 to 30 hours, they're getting experience that is making them more competitive for whatever comes next in their careers. The impact is something that we're trying to assess. One of the things that we do at the end of each cohort is a self-assessment, and so we're trying to get feedback on how much time they are spending and whether this is too much. It really does seem like none of them feel too pressed for time by the ambassadorship, but it's something that I actually want to spend a little bit more time focusing on. So that's a great question, Amanda. Thank you for asking that.

Yeah, that was a great question. Thanks, Amanda, and thanks for the answer, Jeff. We have another question that came in from Jill Sexton, who asks: "Are students largely providing consulting services to their peers, to researchers, or both?" And I'll take the liberty of just adding to that. Cliff ran a series of roundtables in April talking about research continuity, and one of the things that we heard there was that a lot of faculty members are realizing the gaps in their knowledge and how much they've been relying on their students, and are seeking
training now as a result of the current crisis. So I'll throw that in there as well, as an appendage to Jill's question.

Yeah, great question, Jill, and a great add-on there, Diane. The answer is both. They are providing support for their peers, and that may be the majority of the support that they're providing, given, one, the numbers, and two, who's doing the work. A lot of times it's graduate students who are tasked with data analyses, and PIs a little bit less so. I would say that it is mostly peers, but they are very frequently going beyond their own department and extending support more toward a college level. Even though a graduate student in linguistics may not necessarily know that much about sociology, they still are close enough discipline-wise that it's easier for them to talk about applications and challenges than it may be for me, a biologist, to address those things. So I would say they're providing support for both peers and PIs, although in the case of the PIs it may be more an issue of filling in those gaps that Diane was talking about earlier. Good question, Jill.

Interesting, thank you. And another question, from Nancy Hoebelheinrich. She asks how the students are deciding what kinds of workshops are offered, e.g.
the Carpentries workshops that I think you mentioned: are those for specific colleges or cross-college?

Yeah, so the answer to that is also probably both. The Carpentries workshops are organized outside of the Data Science Ambassadors program; we have a unit on campus that organizes and coordinates them. The Software Carpentry workshops that we run are domain independent, so those cut across all domains. We have run a couple of Data Carpentry workshops, Data Carpentry for image processing and Data Carpentry for GIS data and analyses, and the Data Science Ambassadors who have expertise in those domains are the ones who help out with those. As far as the individual workshops that they are providing, that's something where we work with them to identify appropriately sized audiences. We're a little bit less keen to have ambassadors run a workshop where the only attendees may be their lab group, because then you don't have quite the reach that you would like to have with the ambassadorship program. So what we do is try to provide enough guidance so that they can create a workshop and a space that will be accessible to at least a departmental, if not a college-level, audience. So, good question. It's a little tricky, right? Every situation requires a little bit of different attention, but I think it ends up being worth the effort to try to identify appropriate audiences.

Yeah, okay, interesting. Thank you for the question, Nancy, and thank you for addressing that, Jeff. We have another question from Amanda now. Amanda asks: "Are the ambassadors given space in the library to work, and where are they located on the org chart?"

Oh, so that's a good question. The ambassadors are, for the most part, not provided space in the library. The University of Arizona library is toward the end of some
significant construction on the library space, and so ambassadors are not given explicit workspaces. Ideally, as a graduate student, you have that provided by your lab, although the library does provide that for graduate students who do not have it. But I will say that if ambassadors need space for consultation, or they need space to run a workshop, that is something where we work with them to provide space in the library. The library just opened up a new area called the Catalyst Studios. It includes our data science area as well as a brand new maker space and a variety of consultation rooms, and the ambassadors have been making quite a bit of use of that area to provide support in data science. But one of the things that I think the ambassadors are really good at, too, is meeting scholars where they're at, meeting in whatever college or department the folks who need the support are in. Yeah, it's interesting to think more about how we provide that space so that they can provide more support. That's a good question, and I'll have to think more about that one.

We just heard about the Catalyst Studios last week in a great project briefing at CNI. It was fascinating, and it looks like a really exciting space.

It's super exciting. It's so nice to have that space.

Yep, it's terrific. Okay, thanks for that, Amanda. And from Cliff Lynch we have a question. Cliff asks: "What's the duration of an ambassadorship, and can they be renewed?"

Excellent question. We're sort of thinking about that right now, about what that looks like. The duration of an ambassadorship is one academic year: you start in the fall and you go through spring. But one of the things that we miss out on when we do that is continuity, to some degree. Each cohort is independent, and each cohort sort of starts anew. But I think what we're going to do this fall is we may experiment
with a couple of senior ambassadors. There's a pair of ambassadors in the current cohort who are not graduating next year, and so we're going to try to work out something and ask them to be senior ambassadors. We're not sure exactly what that looks like, and we're still working through it, but I think that would provide an important level of continuity and, pardon me, mentoring from one cohort to the next. So thanks for the question, Cliff.

Yeah, that sounds like a great idea, and what a great opportunity for the students.

Yep.

Well, we had lots of great questions. Thank you so much, everyone, for that, and Jeff, thank you. We're drawing close to the end of our time. If you have another question, please feel free to type it in, but I also just want to let folks know that we do have the capacity to unmute you if you're interested in making a comment live or asking a question live. We are recording, so if you would like to do that on the recording, please feel free to go ahead and raise your hand and I'll call on you. And, I'm sorry, I will unmute you, because I can call on you all I want and nothing will change unless I actually unmute you. So go ahead and raise your hand or type in your question. While we're waiting to see if there are any other questions or if anyone would like to make a comment, I just want to go ahead and share with you, in the chat box there, a direct link to the schedule for the CNI Spring 2020 virtual meeting, which will be continuing on through the end of May. We have lots and lots of webinars yet to come, and I hope you'll take a look at that, sign up for some, and let us know what you think. Looking at our boxes now, I see we don't have any more questions, so I want to just thank you again, Jeff. This was such an interesting talk, and we really appreciate you coming to CNI to talk about your project and the ambassadors at Arizona. It was fascinating, and
please go ahead.

Oh, I was going to say thank you. It was cool to talk.

And thank you also to all of our attendees. It was great to have you here. So