Hello everybody, and we really want to thank CNI for inviting us back to give you an update on the National Student Data Corps, which we presented as the Northeast Student Data Corps at the end of 2020 at the CNI virtual conference. My name is Florence Hudson. I'm Executive Director and Co-PI for the Northeast Big Data Innovation Hub and Founder of the National Student Data Corps, and Emily Rothenberg is presenting with me today. Emily is the Program Manager for the National Student Data Corps. We all work at the Northeast Big Data Innovation Hub at Columbia University, and this is funded by the Northeast Big Data Innovation Hub award from NSF. We're very grateful, as always, for their support and their funding. So we're going to talk today about teaching data science and building data literacy nationally and globally with the Northeast Big Data Innovation Hub and National Student Data Corps.

So going on to the next slide, we'll tell you a little bit about the hub first. When we presented a couple of years ago, Cliff suggested we give you a little update on what the hub is. So the Northeast Big Data Innovation Hub is one of four Big Data Hubs funded by NSF in 2015, four regional hubs, and our goal is to improve data science literacy and data science innovation, working with our regional ecosystems and nationally and even globally. So the Northeast Big Data Innovation Hub team is myself, as you can see on the left; Lauren Close, our Operations and Communications Manager; and Emily, who I already introduced you to, the NSDC Program Manager. We're also very fortunate, being at Columbia University, that with our award we can actually hire part-time Columbia students. So they're usually master's students, in the master's in data science or other programs.
We have some undergraduate students too, and they add so much value in helping us understand the community, because when we're talking about a Student Data Corps, we're serving students, and these students actually help to serve the other students, which I'll tell you more about.

So on the next slide, a little bit more about the Northeast Big Data Innovation Hub. You can see we were funded initially by the 2015 grant from NSF, the one that starts with 15. Then we had a cybersecurity risk conference award and another award in 2019. Then we also have the COVID Information Commons, which we present separately in this series, and we also got an award from NIH for the AIM-AHEAD program, artificial intelligence and machine learning to advance health equity and diversity, where we leveraged the NSDC materials. So as the hub, we're a community convener, collaboration hub, and catalyst for data science innovation in the Northeast region and really around the world. We're a community of over 8,300 individuals from 1,375 organizations across all 50 US states and 61 countries, so everybody can join. The hub has grown six times since I took the Executive Director role at the beginning of 2020; it was about 1,400 people at the time. Half of those 8,300 individuals are in the Northeast region, and the rest are across the country and around the world. The COVID pandemic, in a way, really allowed us to broaden our reach, because we did everything virtually and so many more people were able to participate. And so we build a diverse, equitable, and inclusive community with all these individuals, with accessible resources, which is very important. As for what we do, we have four main focus areas. One is education and data literacy, where the National Student Data Corps is our premier program. The second is health, the third is urban-to-rural communities, and then responsible data science, including security, privacy, and ethics, and you can learn about any
of these by going to nebigdatahub.org/about. Next slide, please.

So within each of these focus areas, we have activities that guide how we engage with the community: with the students, with the researchers, the professors, with professionals, with industry, nonprofit, and government. As you can see, in education and data literacy, the premier program at the top is the National Student Data Corps, the NSDC. We also have a number of other cool programs. We actually just announced, and we'll tell you a little bit more about this: when we were talking in 2020, we mentioned that IBM had created an Open Data Science for All (OpenDS4All) GitHub repository, and the news this year is that they've transferred the management of that GitHub repository to us, to me and Emily at the hub and the NSDC. We have a Data Science Resource Repository that we're going to be telling you more about, and a lot of other really cool programs that you'll be hearing about. In the health area, our COVID Information Commons is our premier program, and you'll be hearing about that in another event. We also have a CIC Student Paper Challenge (we call the COVID Info Commons the CIC), and we have working groups and other programs to increase engagement and researcher collaboration in the health focus area. Urban-to-rural communities and responsible data science are two other important areas for us, but they aren't really what we're talking about in this session today, so you can always go to our website, click through to any of these, learn more about them, and get engaged. Next slide, please.
So when we were chatting with you a couple of years ago, the National Student Data Corps was the Northeast Student Data Corps, as I mentioned, and we were just launching at the end of 2020. We had pulled together a founding committee, which was 24 individuals; now it's grown a little bit since then. We were creating three teams, and our goal in our founding committee call for participation was to develop a community-developed initiative that provides resources and opportunities for students to learn data science in a community of support, with a special focus on engaging underserved institutions, students, and communities. And that's exactly what we've done. When you saw this a couple of years ago, if you look at the end-of-2020 presentation on the CNI site, we had the content and pedagogy team, the peer instructor and mentoring team, and the outreach team, but we didn't have all the humans in those different teams yet. Now we do, and so you can see here we have students and we have professionals. These are all people who are on this founding committee team for us, and we're very lucky to have them there. In content and pedagogy, we actually create content, leverage content, and then figure out a pedagogical path through the content to make it valuable for the students.
The peer instructor and mentoring team is mostly students, mostly grad students, who actually teach. Either they record the material in the OpenDS4All GitHub repository in 15-to-20-minute videos for our video library, or they do mentoring and office hours for students doing data science projects, which is one of our new programs. And then our outreach team helps us engage with others in the community, helping the students themselves and the NSDC do outreach and bring others into the program. Next slide, please.

So the vision we had when we presented this at the end of 2020 was that we might have hundreds of students, and we've exceeded that vision. As of June of 2023, the NSDC community is made up of 4,430 individuals from 613 institutions across the US and in 28 countries. We talked about our first program at the end of 2020: the founding committee announced our first data science career panel in February 2021, and over 700 people registered for it. We were shocked; we couldn't believe it. And so that was the beginning of this very deep and broad journey of bringing data science awareness and education to so many students around the world, and to professors and educators as well. So it was the Northeast Student Data Corps; we kept the N and changed it to National Student Data Corps, since we're funded by the National Science Foundation, and we say it's national with an international reach. It keeps growing, and we're so fortunate, and we really hope that you can get involved and engage with us. So Emily is going to be taking you through some of the programs that we have that are engaging this community, and that we'd love to have you get involved in too. Emily?

Thank you, Florence. Yeah, so I'll be telling you all a bit more about the NSDC resources. The NSDC provides over 200 resources that are open, free, and accessible to its broad community. Here you can see the NSDC Learner Central, Video Library, and Educator Central. The Learner Central is designed for users
to learn data science at their own pace. These resources can be used as supplemental education for learners at any point within their learning journey. We have topics that span data science ethics, machine learning, computer programming, and more. The Learner Central also includes the OpenDS4All GitHub repository, which hosts educational modules that may be used as building blocks for data science curriculum or asynchronous learning, but we'll touch a bit more on that project later. The Video Library teaches data science topics from introductory to advanced in short videos of 15 minutes or less. These videos are easy to watch and cover a variety of lessons and use cases, including how to use data science to detect lung cancer, how to use data science to detect clean water sources, the importance of ethics in this field, and much more. Additionally, the NSDC hosts the Educator Central, which houses resources that help educators, whether they are building a new data science program, running a lab and looking for datasets, or just interested in how they can incorporate data science topics in their classroom. You can find programs, courses, and curriculum here from many different institutions, including Columbia University, University of Rochester, Queensborough Community College, and others.

These fantastic comic books, which you can also find in Learner Central, are a fun resource that supports readers in learning about data science ethics. Data, Responsibly was co-created by one of the NSDC founding committee members, Julia Stoyanovich, over at NYU. These comics go over AI and whether it's being used ethically in various scenarios. Certain books are provided not only in English but in Spanish, French, Portuguese, and Greek as well, so there's truly something for everyone.

Next up will be our career and professional development resources. Users may leverage the Career Central to find the latest job listings, internship opportunities, research funding opportunities, and practice materials for interviews
relevant to the field of data science. We also have a section on this page of career resources specifically highlighted for women and minorities in STEM, with a goal of making this field more diverse and equitable. On the right side of the slide, you will find out how to get REAL with the Hub. So if you want to get involved with the hub or the NSDC but aren't sure how to get started, you can browse our Research Experience and Leadership opportunities, or REAL opportunities. These include programs, opportunities, and resource offerings for individuals of all professional and educational backgrounds, no matter your skill level, so we welcome you to find the perfect opportunity for you.

Another leadership and collaboration opportunity here at the NSDC is the chapter system, which is a community of support for learners of all ages that provides learning resources, mentorship, and career opportunities to support each data science journey. There are currently 24 institutional chapters spanning the US, India, and Bangladesh. Chapters participate in a variety of activities, including hackathons, career panels, and outreach efforts to their surrounding communities, including high schools. So if you or someone you know is interested in creating and leading an NSDC chapter at your institution, organization, or region, we welcome you to join this community.

Next, we have the data science career panels. The NSDC regularly hosts these virtual events to showcase the various domains that data science can be applied to. We've hosted panelists who use data science in animal conservation, White House policy, music, and ocean studies, just to name a few. It's so interesting to hear the diverse set of panelists discuss their experiences and insights, and you can hear from additional panelists during upcoming events, or you can watch our on-demand panels hosted on the NEBDHub YouTube channel to be inspired. The NSDC is also launching its new master class series, a video series designed to highlight how data
science is being leveraged in various domains. These domains include athletics, finance, education, and more. The first episode will discuss data science and artificial intelligence in precision oncology, and we welcome you to visit the NEBDHub YouTube channel to learn more about this topic and those to come.

The NSDC data science project program is another great resource for learners at any stage. We created this program for our community because we were hearing that many individuals were looking for hands-on experience with data science projects. In this program, we provide a Google Colab notebook with a full walkthrough of a project inside. We offer resources and mentorship calls along the way, and a certificate of completion at the end that participants can share with their networks. We are currently hosting two projects: a data cleaning project using a Netflix dataset and a sentiment analysis project on an IMDb movie reviews dataset. Future project topics include database management, financial data, and time series analysis on energy consumption.

Additionally, for students who are looking for opportunities to apply their data science skills, the NSDC hosts an annual project challenge called the Data Science Symposium, or DSS. Typically launched in early spring, this symposium invites students to complete a data science project of their choice and showcase their findings on a virtual poster board. We host monthly mentoring sessions with researchers and professionals so that participants can learn the best tips and tricks for completing a research project. On the right, you'll see our theme challenges, or project prompts, that have been created to spark some inspiration. Researchers, professors, and professionals are also welcome to participate in these challenges as mentors and/or judges, and additional information can be found on the webpage linked at the bottom of the screen. And speaking of mentors, the NSDC hosts a global mentorship program, which facilitates mentorship relationships through the
NSDC Slack channel, which now includes over 610 students, professors, researchers, professionals, and other data enthusiasts. So we welcome you to become an NSDC mentee or mentor through this program and to join this very supportive community. So now I'll pass the virtual mic back to Florence for some additional information.

Great, thank you so much, Emily. And I have to say that Emily joined us at the end of 2021 and she's created a lot of this, so we're very grateful to her, as the NSDC Program Manager, for everything she's done to engage with the community and to bring these great programs to so many people, to help them learn how to do data science today and into the future.

So when I took on this role at the Northeast Big Data Hub in 2020, we started putting together a place where all the artifacts from the Northeast Big Data Innovation Hub program could be found easily, so we created this Data Science Resource Repository. Since then, a lot of people have sent us information, and we have over 800 resources in there, including data science best practices, project examples, and different things to increase your knowledge capacity. It includes some of the information that's in Learner Central and Educator Central, so content and pedagogy, data analytics examples, COVID information, industry resources from IBM and Microsoft and Splunk, and all sorts of things. We also have over 75 resources that are in Spanish, because we work a lot with minority-serving institutions, since our mission is to bring data science awareness and education through this program to underserved students, institutions, and communities. It could just be that they're underserved from a data science perspective, in that they don't have a data science program, or they may be underserved in many ways. So we work a lot with our MSIs, and we work with almost 100 different Hispanic-serving institutions, and Spanish is actually an important language in the Northeast U.S.,
around the world, and around the country, so we're adding more and more Spanish resources as well. And so if you have any resources you'd like us to consider adding to the Data Science Resource Repository, you can go to the website or you can email one of us (we'll have our emails at the end), and we'd be happy to take a look, and maybe it'll make it into the DSRR. When you get to the DSRR, you can actually browse it in multiple different ways. We host it on Airtable, and the resources are grouped by subjects or topical areas. You can browse, as you see on the left, or search and filter resources by keyword or tag. You can look at the title or the subject; you can see titles there. The subject could be coding, Python, statistics, machine learning, data analytics, all different things. And then there's the format: is it an article, is it a journal, is it a book, is it a video? Because it could be that you're trying to find all the videos that are available that you can use to learn quickly or to introduce data science to some people. So this is a great opportunity for you to parse the information in the way that's most useful for you. Next slide, please.

And then we were talking a little bit about OpenDS4All, which IBM created with the Linux Foundation for AI and Data. Just this year, in 2023, they asked the Northeast Big Data Hub and National Student Data Corps to take over management of this, because we were very involved with it. We leveraged it, and we had our students actually record videos of some of the PowerPoint slides that are in there and talk through Jupyter notebooks and Python code, and so they asked us to lead it. So there's a great opportunity for you to get involved. What we're going to do is create a new technical steering committee, we're going to create a content review team, and we're always looking for new content developers, so if you're interested in this, please go to the call for participation. The content review team we would
like to have include not just professionals but also grad students and postdocs, so that they can learn how to be peer reviewers of content, which we think will help them in their research or other careers into the future. So this is a great opportunity to leverage dozens and hundreds of resources and to be part of the team that actually creates more for dissemination around the world through GitHub.

So on the next slide, we want to say thank you. Thank you for having us, and thank you if you've participated in Hub and NSDC programs; we hope that you'd like to continue to do that, or to get started with us. You can see our email addresses here. You can always find us at the hub website. You can sign up for our mailing list so that you get our newsletters, join the community or the Slack channel (we also have a YouTube channel, if you'd rather watch the movie than read the book, as we used to say), join an event, or ask us any questions. So it's really been a pleasure chatting with you, and we really hope to see you soon. Thank you.