Hi, good afternoon. My name is Bill Moses. I'm managing director of the Kresge Foundation's education program. The Kresge Foundation is a private foundation located in metropolitan Detroit that provides opportunities for low-income people living in cities. Our education program promotes post-secondary access and success for low-income, first-generation, and underrepresented students. Before I begin, I'd sincerely like to thank the New America Foundation for its research and for the great work it has done for this meeting on predictive analytics in higher education, especially Iris Palmer, Manuela Ekowo, and Amy Laitinen.

First, I'd like to talk a little bit about why big data, and especially predictive analytics, has become a big issue, quite appropriately, in higher education. As most of you know, the United States used to be the best educated nation in the world. But about 20 years ago, we started really slipping, and at this point the United States is maybe the 12th or 13th best educated nation. The best educated nation is Canada, or Korea, depending on the survey or the year. The issue is that when you're not the best educated nation, you're not really able to compete globally. You don't have the talent you need to run the businesses, the government departments, and the nonprofit organizations that keep an economy going. And so in the 21st century, we're operating with a bit of a handicap compared to some of our competitors who are doing better at educating their people than we are.

Moreover, there's a huge equity gap in higher education between different ethnic groups, and the fastest growing ethnic groups tend to have college attainment rates that are lower than the national average. As the country becomes more diverse, the fear is, of course, that if those communities don't get better educated, we will be on an even steeper slope to lower college attainment. Some people think that higher education is about as good as it can get in the United States, at about 45.8% attainment according to the Lumina Foundation, if you include high-quality degrees, diplomas, and certificates. But the challenge there is that we know it can get better. There are places in this country that have 60% college attainment rates, and other countries, as I said, that have them as well. And so we at Kresge think that you can, in fact, increase college attainment in the United States, and there are certain ways you can do that. We don't think you can get back to the economic glory days of the 1950s, but we do think we can improve the attainment rate so that everyone has a chance to live the American dream.

Now, one of the things you hear is that higher education attainment is hampered and handicapped by low K-12 educational attainment and preparation, and I think that preparation probably is a drag on college attainment. But we're beginning to think there are other things happening that also drag down success for students in higher education. This includes everything from ill-planned course scheduling to weak advising to issues relating to poverty, such as whether people have access to the kinds of things we expect students to have, like housing or food. All these things combined often conspire against students really succeeding when they get to college.
One of the challenges, though, is that a lot of academic leaders, and a lot of people in general, tend to think of the students in higher education today as fitting the stereotype of what a student once was: 18 to 24 years old, with a book under his arm, attending a residential college. And that's just not true. That group, the traditional college students, is no longer the majority of college students. The majority is older, less affluent, less well prepared, taking longer to go to school, and often attending more than one institution.

One of the things we also see today is that we're all constantly part of algorithmic analyses. Various companies and firms know what I'm likely to want to eat for lunch or dinner, where I'd like to live, the kind of clothes I want to wear, what kind of book I want to buy. We've all seen the "you may also be interested in this book" suggestion on amazon.com. So big data is here. We haven't seen as much of it in higher education, but that's begun to change. We think we've seen evidence that a lot of higher education institutions could do a much more effective job of using big data to support students and to see why students are failing or succeeding.

But I think it's important to be clear about what we mean by big data, data analytics, and predictive analytics. Data analytics is where you take information from a wide variety of sources, look at it, and figure out what's going on: why are students not succeeding? Predictive analytics takes that information and goes a step further. It says: based on this information, we think that a student with this profile is not as likely to succeed, particularly here or there, and perhaps we can alter what we do to help that student succeed. So if students from this background, or with this kind of GPA, or with this kind of grade on the midterm are struggling, we know that if we do some advising here, they're 20% more likely to succeed, or something like that. You can take that information, disaggregate it, and say: this is what we should do to help a student.

Now, some examples of where we've seen success. We've made a grant to the University of Maryland University College, a predominantly online institution in Maryland. I know there's someone who used to be at UMUC in the audience. They took more than a billion separate data points and could predict, with about an 85% certainty rate, whether a student transferring from Montgomery College or Prince George's Community College was likely to succeed or fail. So that's the first kind of analysis: we know why a student is likely to fail based on the information we have. And then we've seen things like Georgia State, which has gone one step further, taking that information and saying: well, we can change what we do. We know that students seem to be failing because they're not getting good advice on what they should be taking, so we're going to strengthen the advising we're doing. Or we know that students in a flipped classroom in this introductory math class succeeded at a much higher rate, with exactly the same faculty and exactly the same syllabus, as people in a standard lecture.
So why don't we just switch everybody to a flipped classroom? So we're seeing how this can work. Georgia State has, in fact, begun to absorb other institutions and has seen similar kinds of success as it moves ahead. And one of the things that's amazing about Georgia State is that they've hit one of the holy grails of American higher education: African-Americans, Latinos, and whites all graduating at the same rate. And this happened while Georgia State saw a drop in average SAT scores, increased the number of students with Pell Grants, and generally enrolled the very students a lot of colleges would blame as the reason they couldn't graduate more students. They credit this to looking at the data, figuring out where there were problems, predicting where those problems might be, and then solving those problems for their students. So we see this as a really wonderful opportunity to help low-income, first-generation students, and not to rely on anecdote or gut instinct, but to really ask: where are they failing? I liken this to testing a new medical treatment: if you knew this could cure cancer or some other illness, you'd say, how could we possibly move forward without using something as useful as this tool?

The other side of this is something we also saw in Maryland, actually, at Mount St. Mary's University, where an incoming president from the private sector wanted to cut enrollment before it was finally set so that he could demonstrate a higher retention rate. You accept as many people as you can bring in, get the deposits, get the initial first-semester money, and then, as soon as you see that they're not succeeding at the predicted rate, you drop all of them. He likened this to drowning the bunnies and putting a Glock to their heads. So we see that there's also a downside to big data, a downside to using this information: it can be misused.

We also have a program in South Africa, where we promote post-secondary access and success, and when we were there we interviewed some students. One student, when he learned about the possibilities of a place like Georgia State, said: well, I'd like to benefit from that. If you know how to help me succeed, I don't want you keeping this from me. But, he said, I'm also concerned that my information is going to be used by my enemies, which is the phrase he used. And he said, I wouldn't want to have my opportunities tracked so that I didn't have any ability to change my own lot, any ability to change my own life's trajectory, because I think I can still do that.

And so we started talking to New America and asked: well, who is using big data at this point? They went out and did a fabulous landscape study, and then they also looked at the ethical consequences of this work. That's really why we're so thrilled to be working with New America on this: because we know this is coming. I suspect many of our panelists may agree, and they're better informed than I am, but I think this is the technology that is most likely to have a huge impact on higher education in the medium to short run.
And I think the idea is, if this is going to start really expanding, how do we make sure that higher education does it for the most effective reasons: to support students, to make sure that students are not boxed in, that their data is protected, and that they have every opportunity to succeed. At the same time, we don't want institutions to shy away from using these tools under the guise of trying to protect privacy, when in fact, if they used them, a lot of students would graduate with degrees and with less debt. To us, that is just as important an issue as good privacy. So I just wanted to give you a sense of who we are and why we supported this research originally, and to thank all of you for coming. And now I'll turn it over to Iris.

Thank you so much, Bill. I'm Iris Palmer, a senior policy analyst here at New America. My colleague Manuela Ekowo and I are the authors of the two pieces of work we're talking about here today. Last fall, we released the landscape analysis that Bill alluded to, where we really looked at how higher education institutions are using these sorts of analytic tools. We found four main ways they're using them. The first is to recruit, admit, and offer students aid: the enrollment management function. The second is to identify students at risk of failing early and intervene, something Bill talked about: an early alert system. The third is offering students guidance on the courses or degree plans they can use: a recommender system. And the fourth is to help students actually reach their learning goals inside the classroom: adaptive technologies that help deliver content more effectively.

But as we were doing this analysis, we were keeping our eyes open for challenges, ethical challenges, that students and schools might face in using this data. And pretty quickly we started running into them. One is the idea of not allowing choice, which Bill brought up. We actually ran into the story of a student who was told by a professor that she wasn't cut out for the STEM degree she had chosen because that's what the system said. That's exactly the way we do not want these tools to be used. The second is profiling students. If you start to discourage students from taking certain paths, those students might fit a particular profile. If your algorithm isn't actually predictive beyond the fact that somebody is a low-income student, that's not an effective algorithm; that's just profiling a student. (A quick way to check this is sketched just below.) The next is transparency. It's very important for institutions, faculty, and staff to know what's going into the algorithms being used, so they know that what's coming out of them is good, for lack of a better term. And the last one is something Bill also alluded to: privacy and security. Who has access to what on campus, and how do you keep that data secure and going only to the people who need it to do their jobs better?

So those were the challenges we uncovered. Obviously there's also the great promise Bill talked about, and we saw that too. We basically wanted to be able to advise institutions on how to implement these systems more carefully and ethically. That's why we released, this spring, our five guiding practices for ethical use, which try to help institutions think through how they're implementing these systems.
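An editorial aside on that profiling point: one rough way to check whether a model is predictive beyond demographic membership is to compare its accuracy with and without the demographic features. Here is a minimal sketch in Python on synthetic data; the feature names and the data-generating assumptions are invented for illustration, not drawn from the report.

```python
# Sketch: does the model predict anything beyond demographics alone?
# All data here is synthetic; the feature roles are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
low_income = rng.binomial(1, 0.4, n)            # demographic feature
lms_logins = rng.poisson(6, n)                  # behavioral feature
midterm = rng.normal(70 + 3 * lms_logins, 10)   # behavioral feature

# In this toy world, retention is driven mostly by behavior.
logit = 0.02 * midterm + 0.15 * lms_logins - 0.3 * low_income - 2.5
retained = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = np.column_stack([low_income, lms_logins, midterm])
X_tr, X_te, y_tr, y_te = train_test_split(X, retained, random_state=0)

for label, cols in [("demographics only", [0]),
                    ("demographics + behavior", [0, 1, 2])]:
    model = LogisticRegression(max_iter=1000).fit(X_tr[:, cols], y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te[:, cols])[:, 1])
    print(f"{label}: AUC = {auc:.3f}")

# If the first AUC is close to the second, the model is mostly re-encoding
# who students are (profiling, in Iris's terms) rather than what they do.
```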
And I have to say, in our research, while everyone we spoke to was implementing these systems with the best of intentions, there hadn't been a lot of systematic thought about these particular pieces. It's very early in the field, and so we do hope that these practices and principles will actually affect how institutions implement these systems on the ground.

The first one is to have a vision and a plan. That just means bringing together the key people in your institution and making important decisions about how this data will and will not be used, making sure you know what the purposes are and that you don't deviate from those purposes. You definitely see that when you have data available, it becomes easy to use it for more and more purposes without thinking through the consequences of those purposes.

The next one is to build a supportive infrastructure. This means assessing the capacity of the institution to use data effectively, but also communicating with the people on the front lines who will be using this data, because it's actually going to change how they do their jobs if it's used effectively. Training them and communicating with them is really important to building a supportive infrastructure for these tools.

The next one is working to ensure the proper use of data. This is about making sure the data itself is good and complete, because you can come to some pretty crazy conclusions if your data is not complete and high quality, and making sure you're accurately interpreting it. This also comes down to training: just because somebody is tagged as being at high risk of failing, that doesn't mean they will fail. This isn't destiny; it's simply a flag to help direct the resources of the institution. It also includes making sure your privacy and security are taken care of, which is a whole other set of principles, but very, very important.

Number four is designing predictive models and algorithms that avoid bias. This means knowing what goes into those algorithms, picking your vendor well so you do know what goes into them, and then testing those algorithms to make sure they're not having a disparate impact on your students. (A minimal version of such a check is sketched after this overview.)

And number five is where the rubber hits the road: intervening with care. Even if all of your data is wrong and you reach some terrible conclusions, if you intervene with care it doesn't necessarily matter, because you're not going to end up accidentally discouraging students from doing what they were going to do anyway, or telling them that they're not college material. So: carefully communicate to students when deploying interventions. We've seen a lot of research showing that if you're not careful about how you communicate, you'll end up discouraging students by telling them they're not college material. There's also the piece about training staff on implicit bias. We all have implicit biases; it's just a fact of life. And seeing something in black and white can confirm your bias instead of prompting you to examine it more closely, so do some training around that, and around the limits of data, with your staff. And make sure you evaluate and test your interventions: just because you think something might work, you must test it to make sure it actually does. That's what this data will allow you to do if you structure it correctly.
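On practice number four, here is a minimal sketch of a disparate-impact check on an early-alert flag. It borrows the four-fifths rule from employment law as a rough screen; the tiny dataset, column names, and threshold are illustrative assumptions, not anything prescribed by the framework.

```python
# Sketch: compare at-risk flag rates across student groups and apply the
# EEOC "four-fifths" rule as a coarse disparate-impact screen.
import pandas as pd

# Hypothetical flag decisions for students in two groups.
df = pd.DataFrame({
    "group": ["A", "A", "A", "A", "B", "B", "B", "B", "B"],
    "flagged_at_risk": [1, 0, 1, 0, 1, 1, 1, 1, 0],
})

rates = df.groupby("group")["flagged_at_risk"].mean()
ratio = rates.min() / rates.max()
print(rates)
print(f"min/max flag-rate ratio = {ratio:.2f}")
if ratio < 0.8:
    # A ratio below 0.8 doesn't prove bias, but it warrants a closer look
    # at what the algorithm is actually keying on for each group.
    print("Flag rates differ enough to warrant a bias review.")
```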
And with that very quick overview (there's a lot more in there, and I encourage you to read it), I would love to welcome our panel, if you all would come up.

Well, thank you, Bill and Iris, for those wonderful introductory remarks. I'm Manuela Ekowo, a policy analyst here at New America, and I focus on technology and higher education. I'm really excited that George Siemens from the University of Texas at Arlington could be with us here today, as well as Mark Milliron from Civitas Learning and Sylvia Cheney from Central Piedmont Community College in North Carolina. We were also very happy to have them all sit on the advisory council we put together to help shape the ethical framework we recently released, Predictive Analytics in Higher Education: Five Guiding Practices for Ethical Use. We were grateful for the different perspectives they brought to that conversation around the ethical use of predictive analytics. George brought the researcher's perspective, Mark brought a vendor's perspective, and Sylvia brought the perspective of an institutional leader. So we're really excited that you'll all be able to hear those three different perspectives here today. Before we start the discussion, I'll let them briefly introduce themselves and talk a little bit about how predictive analytics is relevant to their work. We can start with George.

Thanks, Manuela. I'm George Siemens, University of Texas at Arlington. I lead the LINK Research Lab. We're involved in a large number of research projects funded by corporate partners such as Boeing and Intel, several NSF grants, and related grants with the Bill and Melinda Gates Foundation. With a lot of our activity around predictive analytics, if you look at the current research projects listed on our website, we've turned more toward the intervention part of it: when you know something, what do you do with what you know? So we're focusing a lot on affect and emotion and those kinds of factors, because, as a New York Times article a couple of years ago, "Who Gets to Graduate?", made clear, it's really not about whether you're as intelligent as the person behind you; there's an entire support structure that exists behind the scenes. So our interest is in how you intervene when you've identified a student the institution may have concerns about completing the course or the program.

My name is Sylvia Cheney. I'm a director of special projects at Central Piedmont Community College in Charlotte, North Carolina. Our involvement with predictive analytics was really born from a grant project funded by the Gates Foundation, focused on course-level interventions to improve equity and access for students who are low income and more of that non-traditional population. So our interest is really in looking at what students are doing in the classroom that helps them succeed, and in communicating with them at an individual level to improve their academic outcomes.

Mark Milliron, co-founder and chief learning officer of Civitas Learning. Civitas Learning is a social purpose corporation founded a little over four and a half years ago, totally focused on trying to help institutions make the most of their data to help their students learn well and finish strong. We work with about 300 institutions around the country.
We have about 7 million active students in our data set and about 200 million enrollment records, one of the largest active learning data sets out there. We do three big things. One, we help stand up a student insights engine per institution, with a cloud-based analytics infrastructure, to create institution-specific predictive models so they can turn their lights on and see what's happening with their students. That powers two other things. The first is what we'd call personalized pathway solutions: things like our degree map solution, which helps students pick courses, pick majors, and make those big choices, and another product called College Scheduler, which actually helps them fit their schedule into their lives, all about supporting the big decisions on the student's pathway. One of our big beliefs is the idea of getting data into the hands of the front lines, helping the students themselves have the data so they can make better choices.

The second big thing the student insights engine powers is what we call precision support. Precision support is knowing who the right group of students is to reach out to in the right way, and being able, as Iris was just talking about, to launch interventions and then test them: understanding which students are right for some kind of outreach, some kind of inspiration, some kind of intervention, doing the outreach, and then testing it. Part of that is a tool we have called Impact. Because we have a student-level prediction for every student at the institution, I'll get geeky for a second: we can do propensity score matching analysis, which means we can do deep studies of the whole family of initiatives people are running, such as course-level initiatives, course redesign, flipped-model classrooms, and products or strategies like learning labs or online tutoring, and actually test the impact of them. So an institution can look at its 30 solutions and say, these 15 are having really good impact, and I want to double down on them. Then we can disaggregate that by student type, which allows them to be more precise in the outreach. More importantly, it gives them a concrete ROI: this is the kind of persistence lift we're seeing, even to the point of seeing the financial outcome. The big bang for the buck on this is that we're trying to build a community of practice around analytics, so that people can really take data seriously and then have these kinds of conversations. That's why we're incredibly excited to be in this conversation, and New America has done a great job in helping catalyze an important one.

Thank you for those introductions. So there's a big myth out there that technology and data are neutral, and that they therefore only enable faculty members, staff, and advisors to make more objective decisions: that they remove our personal biases and our own values from how we make decisions. I'd like to get your point of view on why this isn't the case. How is it that predictive analytics is not immune to bias, not immune to our own personal values? I'd like to hear from everyone, but we can start with George.
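An editorial aside on the propensity score matching Mark mentioned: the idea is to compare students who received an intervention against untreated students with a similar likelihood of receiving it, so that selection effects don't masquerade as impact. Here is a minimal sketch on synthetic data; it is not Civitas's implementation, and a real analysis would need balance diagnostics and much richer covariates.

```python
# Sketch of propensity-score matching to estimate the effect of an
# intervention (e.g., tutoring) on persistence. All data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))  # covariates: e.g., GPA, engagement, credits

# Selection into the intervention depends on X[:, 0] (confounding).
treated = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
# True treatment effect of +0.5 on the logit of persisting.
persisted = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * treated + 0.8 * X[:, 0]))))

# 1. Model the propensity to receive the intervention.
ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# 2. Greedy nearest-neighbor match: each treated student to the closest
#    untreated student by propensity score.
t_idx = np.where(treated == 1)[0]
c_idx = np.where(treated == 0)[0]
matches = c_idx[np.abs(ps[c_idx][None, :] - ps[t_idx][:, None]).argmin(axis=1)]

# 3. Compare a naive difference to the matched difference.
naive = persisted[treated == 1].mean() - persisted[treated == 0].mean()
matched = persisted[t_idx].mean() - persisted[matches].mean()
print(f"naive lift:   {naive:+.3f}  (inflated by selection)")
print(f"matched lift: {matched:+.3f}  (closer to the true effect)")
```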
Oh, that's a huge, huge issue. Any kind of technological instantiation of our thoughts and our social practices is basically a mirror: it reveals back to us at least a portion of the biases that we do have. A few simple illustrations of this relate to stories many of us have heard around criminal sentencing, for example, that takes into account some kind of algorithmic model, whether it's public or not, and we suddenly realize we're discriminating against certain populations for things that have nothing to do with the actual intent of sentencing. From a higher education end, I've said this since learning management systems first started coming into higher education in the late '90s and early 2000s: technology is ideology. It's not neutral; it's not value free. When I choose to use a learning management system in a course, I'm making significant pedagogical decisions. How I'm going to teach is going to be impacted; how I'm going to interact with students is going to be impacted. So, basically, the tools that we use are ideology. It's dangerous to assume that something like technology is free of that. In fact, technology is technique: an instantiation and a solidification of those practices.

So what does that look like in an educational environment? Well, there are going to be challenges with a predictive model that flags certain factors. If you look at a typical predictive model in a university, they're going to use a number of data sources to build that model. Most likely they're going to bring in something from a student information system, and that will likely have any number of data points about the background of that student: the degree completion of their parents, the income of their parents, the zip code they're from, and so on. Coupled with that, they're going to bring in additional data points that might come from a learning management system. If the university is a little more aggressive, they might also start to collect use of university resources; they might use swipe card data and so on. When you bring all of these together, at every stage there are questions about the neutrality of that data. With student information system data, obviously, for anybody from a poor economic zone, the zip code alone will flag that student as at risk and push them to the forefront. A student who is waiting for funding to come in won't have access to a textbook; somebody who's purchased an e-textbook in a timely manner, for example, is going to have a better chance of succeeding. Those are the kinds of factors or variables we're looking at. In some cases that can be good, but when you're holding Glocks to bunnies' heads, it can be a bad thing.

And that's, I think, the core of the issue, which is why so much of our interest now is around affect and emotion: because, much as we've talked about, our students today are no longer traditional students; many more are part-time rather than regular full-time students. I'll add a second component to that, one we don't hear about as much: education is no longer a primarily cognitive process. Education involves so many related factors around wellbeing, affect, and so on. We won't solve the issue of high dropout levels by focusing on the cognitive factors of education. It's an entire support base of family and so on that underpins it.
And quite often, those are the variables that raise the flags that students from certain economic profiles have the most difficulty with. So if our predictive models are based on that, we're going to see those kinds of mindsets and biases reflected in the results we're seeing.

Great: so technology is ideology. Mark or Sylvia, do you want to weigh in?

Sure. I think it's interesting to have that high-level perspective, because what we're doing in our work at the course level is so much more granular that you don't run into a lot of the interactions across institutional systems. The instructor won't see the student's zip code. They won't see a lot of the demographic information that might lead them to make a biased judgment of what that student's work and potential might be. But there is a lot of information they're picking up through their interaction with students, through their face-to-face conversations. It goes back to this idea of technology building on what we're already doing and allowing us to continue a lot of the same kinds of interactions with our students, but to be more cognizant of them. So I think it's a good moment for us to slow down and think a little bit more about all those pieces of information that we pick up in different ways, not just through the piece of technology we're using.

I'll tell you, the interesting thing about the bias issue is the realization that data is only as good as how you use it. Some of that goes to the conversation George is bringing in about the perspective people bring. For years, "data-driven decision-making" in higher education, IR directors would tell you, meant that the administrators made a decision and then drove you to find data to back up the decision they'd already made. And unfortunately, as quote-unquote big data systems came into higher education, some of that got exacerbated, because people used these systems to build algorithms that reinforced what they already believed. You ended up with a lot of demography-as-destiny conversations, where you had regression against demographic variables. And what our data has really been showing is that it's not what you think. We just produced two Community Insights reports, and a couple of the findings blew people away. One is that engagement data is far more predictive than demographic data. It's what students do that is predictive of their behavior, and this hyper-reliance on the categorization and bucketing of students is a real challenge, because it reinforces your perspective. If that is built into your algorithm from jump, and you're looking for reinforcement for why it's true, as opposed to taking a step back and in some ways letting machine learning show you what's actually happening, it changes the conversation. The second finding: most of the triggers for these early warning systems are built on academic triggers, such as falling below a 2.0, academic probation, or failure in a gatekeeper course. What we're seeing in our data is that this catches about 20 to 30% of the students who are actually leaving. One of the data points that blew people away is that across our 7 million student base, really the 4 million student base for this study, between 40 and 48% of the students who were leaving had GPAs between 3.0 and 4.0. They were high-GPA students, and they were leaving for very different reasons.
They weren't leaving for academic reasons; they were leaving for psychosocial reasons. They didn't feel like they belonged; they weren't adjusting. Or for life and logistics: they couldn't get the courses they needed at the times they needed, the transfer agreement was terrible, or whatever was happening. And I think that's beginning to change the conversation. When you're talking about algorithmic bias here, you have to remember that if we do data well, it can actually shine a light on some of our biases from the beginning, because we can then start doing different kinds of outreach to help students across the board in a very different way. Because the truth is, big data is not inherently anything: you could have big, bad data, and you could have big, good data. Part of our challenge is, and I think you said it well in the report, that it's all about how you use it. Is it on purpose, and are you actually testing to see what the data is saying on the other side? Being willing to believe what the data tells you.

So we've touched on this already, but I'd like to hear what others have to say. We've established that there can be algorithmic discrimination, and what I'm coming to see more and more is that the conversation is moving toward algorithmic accountability. How do we hold algorithms accountable for disparate impact in the event that a group of students is discriminated against? This conversation is really being had in the criminal justice system, where they're using predictive policing, using these algorithms to determine sentencing and where to send patrols in different neighborhoods. So the question I have for you all is: what does algorithmic accountability look like for higher education, or what do you imagine it should look like? Anyone can jump in.

I think part of it is being willing to test what you're trying and to lay your assumptions bare. We have a big set of no's in our world: one of the no's is no black box, and another is no one-size-fits-all. One of the things we're trying to do is make sure people get this idea of testing what you're trying and being transparent about it, so you can check it out. Just to give an example: we're seeing in our data that a lot of the early warning systems throw flags, they throw triggers, and people think, wow, that trigger's going off, we should intervene. We've actually started testing the triggers, and we've seen a couple of big problems. One, the triggers are imprecise, and they're heavily focused on academic issues, as I was talking about before. And two, people aren't testing what they do based on that trigger. They assume that outreach is good. We've actually seen, in multiple places, that students who received outreach based on a quote-unquote intervention trigger were actually doing worse. You almost start a death spiral: the student already knew they were doing badly, and then you called them up and told them, oh, by the way, you're doing badly. And it causes a challenge. That's where we tell people: you've got to test what kind of message you're sending, because what we're seeing, for example, is that the best kind of messaging is mindset messaging, normalizing challenge, and helping them understand: okay, so what do I do because of this?
And so one of the ways we hold ourselves accountable is, one, not just assuming that our triggers are right: we've got to test that. And two, we've got to be willing to test whatever we do and see what kind of impact we're actually seeing based on that outreach. We can't just assume that because we do it, it's a good thing. And also, and I'd love Sylvia's take on this, you've got to talk to the students. How did you receive that message? How did it feel when it came to you? I've actually heard students say that what it felt like was: congratulations, you've been flagged; here are the five things you're doing wrong. And what we find is that a simple little message that basically says, "I'm just checking in, are you okay?" has far greater lift for that student. Then they'll say, yeah, actually something is wrong, and this is what's happening, and they'll accept the outreach. But that's part of that testing process and holding yourself accountable.

I think the most important part for us is making sure that, as you're testing, as you're looking at all the pieces that make up that puzzle, you have everyone in the process involved from day one: making sure that all the stakeholders, not just students but your faculty, your administration, and everyone who touches that data or is impacted by it, have their hands in it and their thoughts heard in that circle. One of the things that can happen very quickly is that the data goes into your institutional effectiveness or institutional research wing and gets owned by the wonks, the people who really understand it and can geek out for a minute, when it really needs to come back to a broader audience where everyone's part of that conversation.

An important aspect of algorithmic accountability as well is obviously openness, which Mark addressed, but it's also being part of the research process. I know there have been a couple of instances where adaptive learning analytics companies sit largely outside the research domain, which means decisions are being made about our students, on their data, and we don't know exactly what happened behind the scenes. I was involved in setting up the first learning analytics conference, about eight or nine years ago now, and our point initially was just to create a space for academics to share, as they engaged with educational data, what they were doing and how they were working with it. About four years ago, we added a separate stream, the practitioner stream. We still wanted to hear what the researchers were doing, but because learning analytics is extensively used in a practical way on campus, we felt it wasn't enough to have it as an esoteric, research-driven conversation; we had to start including the practitioner voice, so organizations like Civitas and others could come along and contribute. And I think if you want algorithmic accountability, it's exactly as Sylvia was saying: you need the right stakeholders. You need the researchers, the IR people, the vendors, the students, and the academics involved in those conversations.

And then we'll do some design thinking. Getting the faculty and the students and the advisors who are actually wrestling with this data involved, they can bring the data to life.
I mean, remember, all these bits and bytes, zeros and ones, are people with stories. When you start unpacking the stories, it really makes you understand: okay, this is the challenge we're trying to overcome. It makes you, again, much more of a student success scientist. But that means you've got to be willing to get into the mix and bring in the voices.

All interesting takes. So, Sylvia, I wanted to talk to you a little bit more about algorithmic discrimination for adaptive technologies, because I think it's a little less clear here, only because these tools don't necessarily use the typical demographic data that early alert systems use to flag students at risk, or that course recommender systems use to suggest particular majors or courses a student should take. In the adaptive technologies world, it's more about the learning science and what we know about how students learn. In thinking about that, what I realized is that there isn't enough conversation around ensuring that these tools are built with all learners in mind, including learners with learning disabilities or learning challenges. So I'd like to hear your take on algorithmic discrimination, not in the typical sense of demographic characteristics like race or age or income or gender, but in the sense of ensuring that we're creating tools that are accessible to all learners.

Good point. I think one of the things everyone wants as an instructor or a faculty member is a magic piece of technology that will make it very transparent and very clear where everyone's tripping up intellectually in their conceptual understanding of something that's really difficult. For instance, we're using it in biology and chemistry courses; those are really complicated subjects, and it helps to have more transparency into where each learner is having a challenge. But the technology isn't going to resolve that for us, right? What really happens is that we create more opportunities for faculty to engage students directly. That's where you see the more personalized experience that helps meet every learner where they are. They do get information from the system, and that does kickstart the conversation, because we can start to see where students are tripping up. But it really comes down to something we know works very well: a faculty member engaging with that student one-on-one to understand a little bit more where they're coming from and why they're having difficulties.

I don't know if Mark or George want to weigh in on this piece about accessibility and adaptive technology.

I actually think this is one of the places where we can hit head-on an issue that's pretty near and dear to my heart, which is that if we stick to this reductionistic view that predictive analytics are about helping at-risk students, it is dangerous. We have to flip to the idea of leveraging data to help people make the most of their education pathways. And that means helping high-ability students go faster, and helping students who are in the middle get to a higher level.
Because, by the way, if you think about this operationally, systemically, and sustainably, you'll realize that if you can help these people go faster and farther, you can actually harvest resources that can help your more challenged students. For example, in adaptive learning, you have some students who can go twice as fast; in fact, they'll learn better more quickly. So part of what we have to do is figure out how to optimize learning experiences, along with things like pathways and choosing courses. I think we have a whole group of students who are held back by our systems and structures and infrastructures, and if we can help them go faster and farther, we should do that and leverage data to do it. We have some students stuck in the middle who can probably raise their game, and then we have students who are challenged. But the reductionistic view of "we've got to help the at-risk students" is holding us back, because we're not opening the blinders. That finding about the high-GPA student leavers has made people gulp: at first they didn't believe it, and then they went back, looked at their data, and said, oh, it's actually happening. But that should make you think. And by the way, a little bit of effort with that group saves them, and saves them quickly, and then you can redirect real resources to help the harder-to-serve students. Especially with something like adaptive learning, in my gut, when I talk to students, you hear it all the time: there are a bunch of high-ability students who want to go farther and faster, and it would be great if the analytics could say not only what you should be learning in this specific course, but what internship you should have, what co-curricular experience you should engage in, what three other students you should meet to optimize your experience at this university. That's where it's going to get really interesting.

Two projects I'll raise that we're involved with at LINK address some of these aspects of getting these models down to something more practical that's relevant to individual students. One, with a colleague at the University of Sydney, is the idea of keeping students focused on task through adaptive feedback. One of the things we found is that we love visualizations, but for students they're actually useless: you give a student a dashboard, they spin a few dials, and they go away. So we really started focusing on what we give students. We cluster students according to set categories, and then, in advance, the faculty member writes specific responses to those student profiles; a sketch of that pattern appears below. At regular periods in the course, we focus on the formative aspect of feedback to individual students. That's our OnTask project, and the results are significant in keeping students engaged and motivated. There are a lot of factors unique to the type of student someone is and the life circumstances they have, and those need to be reflected through adaptive feedback, rather than something we trust to be purely analytics- or algorithm-driven. It gets down to what Sylvia said: a human being at the other end.
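The pattern George describes, clustering students into set categories and attaching instructor-authored feedback to each profile, can be sketched in a few lines. The features, cluster count, and messages below are hypothetical placeholders, not the actual OnTask implementation.

```python
# Sketch: cluster students on engagement features, then attach one
# pre-written, instructor-authored message to each cluster.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# rows: students; columns: e.g. logins/week, quiz average, forum posts
features = np.array([[12, 0.85, 4], [1, 0.40, 0], [6, 0.70, 1],
                     [0, 0.30, 0], [10, 0.90, 5], [5, 0.55, 2]], dtype=float)

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(features))

# The faculty member writes one response per profile in advance.
# Note: cluster numbering is arbitrary, so in practice the instructor
# would inspect each cluster before assigning messages to it.
messages = {
    0: "You're on track. Consider the optional extension problems.",
    1: "I noticed you haven't logged in recently. Is everything okay?",
    2: "Good progress on quizzes; try posting in this week's forum.",
}
for student, cluster in enumerate(labels):
    print(f"student {student}: {messages[cluster]}")
```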
A second project that's been very successful on the UTA campus is our IDEAS center, a fairly large Department of Education grant where we're looking at how you provide non-cognitive support. I don't really like the term non-cognitive, but how do you provide support to students who come from a range of different backgrounds and are having difficulty succeeding? We set up, in a physical space, a sort of triage environment. Students come in and get support on all the things that, if you're one of two children in a middle-class family with parents and grandparents with degrees, you're very well supported on. What if you're a first-in-family degree completer? You don't know any of those details. How do you provide support for that student? Helping them succeed, from a predictive analytics end, is more cultural and social and developmental than it is getting them through an academic program. And the idea of our virtual IDEAS center is to take those concepts into digital environments, so that you give individuals that kind of support at the various stages of their journey where it will be relevant to them.

Well, as exciting as that conversation on algorithmic accountability and discrimination was, let's move the discussion to transparency. Another major concern with predictive analytics is that these tools are often black boxes, which means that the nuts and bolts of what went into their creation and how they were built by product engineers and others may not be clear to those actually using them on campus. Students, faculty, and advisors may not understand how exactly these tools are making the decisions they're making and arriving at the conclusions they're arriving at. This can particularly be the case for vended products: if an institution is not creating predictive systems in-house but is working with a partner, a data science company, a vendor, then, because they're a few steps removed from the creation process, they may not be privy to how the system was created. So we stress in our framework, Predictive Analytics in Higher Education: Five Guiding Practices for Ethical Use, the need to choose vendors wisely, and Iris talked a little bit about that in her remarks. The question I have for you all is: how exactly does this look in practice? How does an institution go about choosing a vendor who will be cooperative, who will ensure that their products are transparent and can be implemented transparently? Anyone can start.

I'll jump in quickly, because it's one of our no's, right? No black box. We've made it very clear in our work with institutions: we want them to become student success scientists with us. One of the things we do is immediately start ingesting the different data sources, whether it's SIS, LMS, card swipe, or, more likely, advising system data, CRM, whatever it might be. You want to validate the data hand in hand with the institution to make sure you're getting the right data in the right place, and there's often a lot of cleanup in that process. But one of the things we have found is that derivative variable creation is where the art of this comes in. Derivative variables are the calculated variables that cut across systems.
For each institution, we'll calculate between 1,000 and 1,500 derivative variables. Derivative variables are things like, instead of low-income status, the affordability gap, which is the delta between the tuition a student owes and the financial aid they have. Those kinds of calculated variables turn out to be far more predictive than the point variables. Or engagement data: it's not the raw number of clicks in a class, it's the engagement relative to that section, because some people use their LMS a lot and some use it a little, so you want to calculate that differentially. (A sketch of both of these derived features appears below.) And the art of the derivative: we have another one called the degree program alignment score, where you take the modal student success pathway through a given major, not what the institution thinks it should be, but what the most successful students are actually doing, and you calculate the extent to which a given student is on or off that pathway. Hugely predictive. Anyway, you do that work and then narrow it down to the 200 to 300 derivative variables that are driving most of the variance for a given institution. And we're showing it all to the institution; we're actually giving them plain-English explanations of what all those variables are as we build their predictive models. By the way, 95% of the variance in the predictive models tends to come from the derivative variables. So that transparency really matters.

But we had to take it a step further. We started surfacing the predictions, which turned out to be incredibly accurate, within the products. With Inspire for Advisors, we found that we'd show a student detail page with a prediction score saying this student actually needs some outreach right now, but then the advisor would look at the data they were used to looking at: a GPA of 3.8, taking 15 hours, all A's last semester, and they'd say, this student is not in danger; I'm not going to do any kind of outreach. So we had to expose what we call the individual feature ranking: we show the top 10 predictors driving the flag for that student. And suddenly you find out this student hasn't engaged in their LMS in the last three weeks, they dropped one of their courses, they did this and that, and the advisor sees a bunch of those derivative variables and gets it. What we found is that transparency really matters, because trust is a really big deal: people need to get, okay, that's what's going into this. But it's an iterative process; you've got to be able to work with the institution.

And we went one step further. We created a product called Explore, which literally takes all the work of the student insights engine, puts it into a data mart, and says: if your sophisticated advanced-analytics organization wants to use SAS Visual Analytics or Tableau or whatever to do its own work, you can take all of our data products and bring them into your work. That's especially a big deal for research-one universities, because they've got incredibly smart people on their staff who want to roll with it.
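Here is a minimal sketch of the two derived variables Mark names, the affordability gap and section-relative LMS engagement. The DataFrame and column names are invented for illustration; Civitas's actual pipeline is of course far richer.

```python
# Sketch: two derivative (calculated) variables built from point variables.
import pandas as pd

df = pd.DataFrame({
    "student": ["a", "b", "c", "d"],
    "section": ["BIO101-1", "BIO101-1", "BIO101-2", "BIO101-2"],
    "tuition_owed": [4200, 3800, 4200, 4000],
    "financial_aid": [4000, 2500, 4200, 1000],
    "lms_events": [120, 30, 45, 60],
})

# Affordability gap: the delta between the tuition owed and the aid held.
df["affordability_gap"] = df["tuition_owed"] - df["financial_aid"]

# Relative engagement: clicks normalized against the student's own section,
# since some sections barely use the LMS and others lean on it heavily.
df["relative_engagement"] = (
    df["lms_events"] / df.groupby("section")["lms_events"].transform("mean"))

print(df[["student", "affordability_gap", "relative_engagement"]])
```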
But even some of our community colleges do amazing work; they have great IR offices, and they love to take those products and run in their own direction. And by the way, just a shout-out, Bill, on your talk about Georgia State: what's great about this work is that it's no longer just the Georgia States. We've found multiple Achieving the Dream colleges that have closed the achievement gap between different ethnicities. The University of South Florida did the exact same thing, in fact getting to a first-year persistence rate of over 90%. The neat thing is that when you make it transparent, and people feel like they're part of the process and they own it, they take a lot more action. They're willing to do that. But I think that transparency is a big, big deal.

For us, that transparency kind of looked like a classroom. Again, when you start to pull in faculty and individuals who aren't used to opening that box and haven't seen that box before, there's a lot of teaching that has to happen, and we were really lucky in working very closely with most of our vendor partners. Some of you may be working with Smart Sparrow as part of their Inspark initiative; we actually worked directly with their developers and their subject matter experts to look at exactly what their software was doing, in terms of every point of data it was collecting, how it was making decisions about where students should go, and what content to direct them to. What's most interesting for us is that it's not the only piece of adaptive software we have, and as we started to look at this piece of software and understand it more, because we had a really good vendor on that end, it caused us to stop and look at the vendors who maybe hadn't been as transparent and start asking questions we should have been asking all along. It got us to a point where we understood that we need to be more strategic about this, that we have to have those conversations and push from our end to have our vendors really teach us and let us be part of how their products are shaped.

That's a terrific point, and I'll add to it. One thing I'd look at, if you're considering working with a vendor in the education data analytics space, is: are they publishing research? Are they subjecting their ideas to peer-reviewed research? If you go to the Learning Analytics conference series and do a simple search, are they listed there? If you go to the Educational Data Mining conference, are they listed there? Because if they're not part of the research community, you're on the receiving end of a possible sales pitch. Practical details are important, and I don't want to dismiss them, but I'll give you one example, and I'm fine dropping names. Early on, we tried to get Knewton involved in the learning analytics space; they were one of the first companies I approached as I was heading up the conference. We specifically asked them: can we have access to the data? Can we do analysis work on the data? Because they were reporting insane learning gains that just didn't make sense according to the literature. The end result was that we didn't get access. What I'm saying is, if a vendor isn't willing to subject what they say is important to the research community, I'd be worried about them as a vendor in an educational space. Another thing I'd look at when I'm talking to a vendor is whether they listen to your unique situation and setting.
Some colleges are going to have a completely different student profile than an R1 university. In fact, in the US alone, the general higher education space can't learn a hell of a lot from R1 systems; the vast majority of students who matter for success in American settings are in community colleges and other systems. A small percentage go to the top R1s, but there's a lot more going on. So do vendors recognize that part of the space? Are vendors willing to engage with you in a way that develops their product for your specific needs? Do they get off the sales pitch and actually talk with you? As Sylvia said, we're involved from a research end in evaluating the Inspark product as part of the Smart Sparrow grant, and we're finding that being able to engage with a vendor that is both a practitioner and connected to the research space gives academics a lot of confidence. When a provost has to make a buying decision, knowing that there's neutral commentary going on about the product goes a long way in building trust.

So, Mark, you said that your policy is no black box, that you show it all and explain how your systems work in plain English. And Sylvia, you remarked that choosing a vendor at Central Piedmont looks like a classroom: you want your vendors to teach you. One concern in the conversation around predictive analytics and algorithms is that transparency is really only the first step; understanding is the second. Presumably institutions shouldn't shy away from pushing their vendors: no, show us the data, show us how your systems work, and teach us to understand exactly what we're looking at so we can analyze it critically. So if transparency is the first step and understanding is the second, how do those two things play out at the vendor level, for researchers, and for institutions?

We've actually been really lucky at Central Piedmont. We have a fantastic institutional research department that's really well staffed; it's a luxury most community colleges don't have. So we're very aware that we have a capacity to work with data and to look deeper into a lot of issues that other institutions just can't. I think the first thing is to acknowledge that you have to make that institutional investment, that you have to be interested in putting more resources into data as an institution. It's about having the right people who can speak that language. The other piece really comes down to getting to a point where everyone understands as much as they can, and where you have some amount of trust. You have to trust the people who are the data wonks at your institution. You have to have trust in the vendor, because there are going to be things that you're not going to get to understand, or that are going to be proprietary, and that's okay. The vendors are there to make some money; it's a product, it's a business, they're entrepreneurs. They're not going to show you everything all of the time. So you have to have a really strong relationship with them, to get to that point where you understand, but also trust, to a certain extent.
I think it's incredibly important that the people you're working with are authentically engaging with you and your challenge, that they've taken the time to understand your context, and that you feel like they're partners in student success science. "Partner" is an overused term, but you know a partner when you have one, and you know a mere vendor when you have one; it's just different. You've got to have somebody who's willing to sit alongside you, who is authentically curious and really driven to help you solve the challenges. My co-founder Charles Thornburgh is fond of saying we are an organization that's obsessed with outcomes, not software. The idea is that we obsess over helping the institution get to the outcomes it's trying to get to, and those outcomes are always student success stories. So that's a big thing. The second thing goes to your question, and it's probably one of the most important things we can bring up in this conversation: we're in a big shift in the world of data in higher education. Let's be blunt: 95, 98% of the data work done in higher education for the last three decades has been accountability analytics. It has been reports: getting reports to your accreditors, to legislators, to trustees. This is a new thing, and because it's a new thing, a lot of the people who run typical IR offices are focused on getting IPEDS reports out, and those, by the way, are things you have to get done; you have a report, you have to get it done. The challenge is that doing these new kinds of analytics is different. Our chief data scientist comes from the world of healthcare, and after three or four months of working in education, his first reaction was: my lord, this field is obsessed with autopsy data. Wouldn't they rather have data during the operation, or diagnostics so they didn't need the operation? And that's the shift. You're seeing this cultural change happen right now, where organizations are beginning to create different kinds of structures: one division does the reporting, and another does what I would call action analytics, operational analytics, the day to day. By the way, you have to have a much higher tolerance for messiness with operational analytics, because when you're doing an audit or an autopsy you can be incredibly precise, but when you're trying to save a student today because they're about to leave, you can't wait a year. We've had some institutions where we've actually given them a list, and I won't name the institution, of 1,007 students. We said: these students are in the bottom decile of our predictive score; they're incredibly unlikely to stay unless you do some kind of outreach. And they said, okay, that's interesting, let's go work on this other project. So we worked on the other project. We came back a year later and said, by the way, let's look at that list. We looked, and all 1,007 of those students were gone. Every single one. And I've got to give the provost credit, because she sat there and said: I want everyone in this room to realize we knew this before the term. And that's where, and I want to raise the stakes on this, this is the moral imperative of knowing.
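To make that "bottom decile of a predictive score" idea concrete, here is a minimal sketch in Python. It assumes you already have a predicted persistence probability per student from some upstream model; the student IDs and probabilities are invented for illustration, not any institution's actual scores.

```python
from statistics import quantiles

def bottom_decile(scores):
    """Return the student IDs whose predicted persistence
    probability falls in the lowest 10% of the cohort."""
    cuts = quantiles(scores.values(), n=10)  # nine decile cut points
    threshold = cuts[0]  # upper edge of the bottom decile
    return [sid for sid, p in scores.items() if p <= threshold]

# Hypothetical model outputs: probability each student persists.
scores = {"s001": 0.91, "s002": 0.12, "s003": 0.55, "s004": 0.08,
          "s005": 0.73, "s006": 0.44, "s007": 0.19, "s008": 0.88,
          "s009": 0.31, "s010": 0.66}
print(bottom_decile(scores))  # candidates for proactive outreach
```

The point of the sketch is only that the flag is cheap to compute once the scores exist; acting on the list, as the story above shows, is the hard part.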
When you turn the lights on and you actually know, I mean, if you're a doctor and that CAT scan says the patient has cancer, you probably have a moral imperative to have a conversation with that patient, right? That's part of this conversation: it's a different version of analytics, and you have to be willing to engage it. Now, that carries a whole series of other ethical implications, including having an ethic of do no harm. You've got to care about do no harm; you don't want to use the data to track people off in directions that hurt them. But let's not kid ourselves: this is new, and being willing to have the kind of partner who can work with you to think it through is incredibly important. Wonderful. There's clearly lots to consider here around choosing vendors wisely and ensuring transparent use of predictive analytics. So the last topic we'll discuss before we open up the floor for Q&A is data privacy. Privacy is obviously on the minds of a lot of people. Colleges are making use of data in new ways; Mark just said it, this is very new, we really are in a new space and a cultural change is afoot. The trouble, though, is that as colleges make use of predictive analytics to understand how students are interacting with campus resources, for example whether or not they're taking advantage of tutoring services or mentoring services, they're also making use of the digital bits that students leave behind in their learning management systems, or perhaps in a MOOC, or in adaptive tools. So we're getting, we hope, a clearer sense of how students are engaging with their campus and also how they're learning. But as we gain that better vision, questions about privacy and security come into play. So I'd like for you all to talk a little bit about what new privacy concerns have surfaced for your partners or for your institution as a result of using predictive analytics. Anyone can start with this one. From a research end, privacy means something different than it does from a practitioner end. Anytime we do anything as researchers, it has to go through an ethics board; it has to be reviewed. Universities that decide they want to do something with student data but have no interest in publishing it aren't constrained by those same requirements. So you can certainly harvest card swipe data, say, without students necessarily being made aware of it. Recently there was the issue where Uber would do certain things to track activity in certain regions, or to identify when somebody, a possible user of their service, might be law enforcement. When that comes out, it makes the company look bad that they had been doing it all along. And I'm quite confident that in higher education, in the next five to ten years, we're going to see some of these things come out, where somebody will say: ooh, universities were doing what with my data? So from a practitioner end, that is definitely a concern. From a research end, the question is always more data. Right now we've spent a lot of time looking at wearables, and biometric or psychophysiological data has been a big area of interest for us: how do we look at things like heart rate variability as a guide to engagement?
One of the things we developed is an affect grid, as part of asking: are there certain affective states that pre-exist when students discontinue a learning session, or that pre-exist when a student actually begins to drop out of a course? So we're interested in those elements. From a research end, that's all good, the more the better, obviously after it's gone through IRB clearance. On the ethics end, though, one way to solve a lot of the ethics or privacy questions is: do students see what the institution sees? I know it's not as cut and dried as that, but my view would be: if faculty can see it, a student should see it; if an administrator can see it, a student should see it. That solves a lot of the privacy issues, because we all realize what's going on and students can flag it for themselves. And as a student, I may say: you know what, take all my data, install cameras in my house, do whatever you want; if it's gonna help me graduate, I'm all for it. So I think a student should have some agency in those kinds of data decisions, but you've got cases like inBloom and others that have come up. Once you start touching on data and learning and privacy, it's gonna be very distinct for each individual. Privacy, in digital space, is a transactional entity, like money is in physical space. What I mean is: if I go out to buy a hamburger and someone says that'll be $78, it had better be an incredible hamburger, and by and large I'll decline, because I know that's not good value for my money. But I use Facebook and other social media tools, and I go to a university and provide my data, and I have no indication of whether I'm getting a fair exchange for it. Once we understand that privacy is a transactional entity that gives us access to services but by the same token takes something from us, once we understand the value of that transactional space, we may well end up saying: no, I'm not willing to make that exchange for that return. I think you mentioned it earlier: we use all these services, and as long as it's working for us, we're very happy to check that terms-and-conditions box, maybe without reading it all the way through, maybe just skimming it. Even at the level where we're using data for one course, for one section, I think there are four gates of terms and conditions that students check off on, which make them aware of all the information we're collecting and all the information we're sharing. I can't say with full confidence that every student in that course has fully read every page of those terms and conditions and understands exactly what we're doing, even though we have been fully transparent. What's gonna have to happen, I'm just being honest, is that we reach a point where that plain-English language is the language everyone sees; where it isn't just a dashboard where students see the exact same thing as an instructor, where they're a dot on a graph, but where they really understand what that dot is, why it matters, and how to use it, in the same way that we're educating ourselves to be smarter about how we look at their data and are willing to educate our students, taking into account where they are in life and where they are in their academic capacity.
They're a very vulnerable population, especially who we're working with at community colleges. So it's a very challenging issue. I don't think we've resolved it; we've been very cautious about it to date, and we'll continue to be cautious in the years to come, providing more information and erring on the side of a little too much detail, too-long-didn't-read territory, to make sure we're transparent. But I'd hope that as we get better with the technology, as we get to a place where we're more comfortable understanding what it is we're asking for and why we're asking for it, we're able to have more condensed terms, a more compressed way of understanding and sharing exactly what information we're collecting and why it's useful. You know, I'm just gonna go back to the basics of this, and you all get this: all these data points, all these bits and bytes, zeros and ones, they're footprints of a student's journey through our institution. And if you pull these footprints together, they tell a story of where students are struggling, where they're succeeding, what's really working, what's really not working. And what I always go back to: I was a first-generation student. We were just talking about this before at lunch. I come from a family of nine kids. I have an African American brother, a Native American brother, a Korean sister; 25 foster kids rotated through my house during the time I was growing up. My mom was a special-needs nurse. My dad was out of his mind. Big, rowdy household. I was the first one in my family to travel through higher education, and I had no clue what I was doing. Literally did not. When I started at Mesa Community College in Arizona, God bless them, if it wasn't there, I wouldn't be here. I didn't know what an AA stood for. I didn't know what an AS was. I kind of knew what BS was, right? I was just figuring it out as I went along, and thank God the right faculty members showed up. Jim Mancusa, Rick Myers, I can remember them; they just totally changed my life and got me on a very different trajectory. I went through and got my bachelor's and master's and PhD, but I would say I was completion by serendipity. I was not completion by design, right? And the difference is, and George talked about this: second, third, fourth, fifth generation kids coming into higher education are heavily scaffolded, don't kid yourself, by the stories of their parents and grandparents and friends and family, all of whom know, who have the data, who have the idea about how you succeed in this higher education game. Our challenge is that we are flooded in the world of higher education with first-generation, low-income students who have no scaffold, zero. And it is in many ways ethically imperative upon us to wrap resources around them. If we can use these stories of what works and what doesn't work, the stories of students past, to help them be more successful, I just think it's gonna be powerful. And what I hear from students all the time, and you mentioned this, is students saying: I'm okay with you using my data, as long as you use my data to help me. Because right now we're using their data to do reports, as opposed to using their data to help them make decisions, which leads to two imperatives. One is, and you mentioned this: get it to them.
Get the data to the student and to the faculty to help them make better choices in the moment. I think our bias has got to be to get the data to the front line and let them interact with it. It's like, how many of you are wearing Fitbits right now? Health data is now in our hands, where we actually hold more of this data ourselves, which I think is gonna make it more powerful. The second thing is the realization that the worst end of this would be if it turns into getting students to simply follow our directions. If we start losing student agency, and it's just them doing what we tell them to do because our data says it, that's the worst version of this. The best version is, remember the metaphor of a scaffold: a scaffold is there in the beginning, and then it begins to go away. So one of the things we should be thinking about is how we do this work in a way where we help a student in the beginning but then start pulling that support away, so they can develop more and more agency along the pathway. Because this can't just be about having them follow our directions, and that's where the art of this is gonna come in. A lot of us are experimenting at the highest level, and you've got this in the deeper learning analytics spaces: how do you actually take those scaffolds away so that the person drives more and more of their own agency going forward? I think the privacy thing is a really big deal. Are we gonna have bad actors? Absolutely. I think the commoditization of student data for sale at a profit is gonna be a challenge. What we've gotta think about is this: if we're using the data to help students succeed and to help the institution make itself better, that's the gold standard. Great. Well, thank you George, Mark, and Sylvia for this very rich conversation. So now it's your turn to ask questions. We have about 20 minutes for Q&A; if you have a question, just raise your hand and a mic will find its way to you. We have a question. Hi, I'm a retiree and I'm pretty naive about a lot of this stuff, although I have been helping in an AP statistics course in high school, and I'm struck by the automation going on in that course, much of which is invisible to me. I am curious about a couple of things about these tools that are being developed. One question: what's the difference between predictive analytics and statistical forecasting? Are they identical, or is there something different about it? I hear a lot about this in the statistics community, with a certain degree of dubiousness. And are you using microdata of various sorts within the classroom, and how do you collect it? I just need some examples of what you're doing, rather than talking at a high level. First question: yes, in many ways prediction and predictive activity is largely statistical work. There are a number of initiatives now trying to become more sophisticated; machine learning is making an impact in education, trying to get more sophisticated in the analysis work, and we'll certainly see that develop over the next while. But generally, the scope of the data, and the fact that you're dealing with multiple data sets rather than one single data set, starts to provide a richer sense of what happens in social settings. So to get to your question: in many ways, it is very much related to statistical forecasting.
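To underline that point, here is a minimal sketch of the statistical workhorse behind much of this: a logistic regression fit by plain gradient descent. The features (GPA, LMS activity), the numbers, and the outcomes are invented for illustration, not any panelist's actual model.

```python
import math

# Toy training data: (GPA, fraction of LMS days active) -> persisted?
X = [(3.8, 0.9), (2.1, 0.2), (3.0, 0.6), (1.8, 0.1),
     (3.5, 0.7), (2.4, 0.3), (2.9, 0.8), (1.5, 0.2)]
y = [1, 0, 1, 0, 1, 0, 1, 0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Fit weights by stochastic gradient descent: this is the
# "statistical forecasting" core of many predictive models.
w, b, lr = [0.0, 0.0], 0.0, 0.5
for _ in range(2000):
    for (x1, x2), target in zip(X, y):
        p = sigmoid(w[0] * x1 + w[1] * x2 + b)
        err = p - target
        w[0] -= lr * err * x1
        w[1] -= lr * err * x2
        b -= lr * err

# Score a new student: estimated probability of persisting.
gpa, activity = 2.6, 0.25
print(round(sigmoid(w[0] * gpa + w[1] * activity + b), 2))
```

The "predictive analytics" layer is everything wrapped around a model like this: the multiple data sets feeding it, and the interventions triggered by its scores.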
The other aspect is what it looks like at a classroom level, and there are a number of opportunities there. Let's say you have a student taking a blended course. You've got your student information system, which will provide some type of student model based on the background variables we talked about. Once they're in the classroom, though, it can be any number of things. Let's say you've got an LMS with an e-book attached to it. Take any learning management system, Blackboard, Canvas, or any of those: the student logs on, and it timestamps when they log on; it timestamps what they click, where they go, and what they do. If they open up an e-book reader and make some annotations, we can look at what they highlight and what they type to themselves as notes while they're doing that annotation work. If they then go and watch a video, we can see when they click the video to start watching it, whether they rewind, when they return to videos, and how frequently they return. Those are all data points we collect. So it's really some combination of log data, clickstream data, and student information system data. If students want, you can incorporate any amount of social media data. You can also look at reflective exercises that are built in for data elicitation: you might have a natural language analysis of the sentiment and tone of students' self-reflections that gives you an indication of how they feel about the course or how they're experiencing it. So there's a huge number of factors we look at. Right now we're doing some work around confusion, or where students disengage from content. We're also looking, as I mentioned, at psychophysiological data: students wear an Empatica E4, which is a very expensive wearable. It collects heart rate data, so we're looking at the inter-beat intervals to determine student engagement. You don't want a student who's too agitated, because they'll end up dropping out, and that's literally an issue; but students who are near comatose also won't succeed. So there's a zone of optimal engagement, as indicated by heart rate variability, that we then try to match learning design practices to: if we do this activity in this setting, this is the outcome. I'm just trying to answer your question at the specific level of data; those are some of the sources we're looking at. Oh, absolutely, and it goes beyond that. Everything is data these days: what did the student take out of the library? When did they visit student services? In a classroom, faculty might have a student success system where they add notes about certain students they've interacted with, and the system can capture some of that. It might also capture things like discussion patterns in a classroom. One of the things we've been working on is "for research purposes only," which is the Latin equivalent of saying we just wanna analyze everything they do: a fully architected classroom where you see seating patterns, and where you can, for example, use toolsets for facial recognition to try to indicate confusion or states of frustration. It'll automatically recognize students in those spaces; it'll let you know how groups form and re-form, and so on. In a typical classroom that kind of analysis work is not the norm, but certainly an instructor taking attendance is a simple illustration of the same idea.
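As a concrete illustration of the log, clickstream, and heart-rate signals just described, here is a minimal sketch in Python. The event names, the log format, the weights, and the example numbers are all illustrative assumptions, not any LMS vendor's actual schema; RMSSD is one standard summary statistic for heart rate variability.

```python
import math
from collections import Counter, defaultdict

# Hypothetical clickstream records: (student_id, event_type, timestamp).
events = [
    ("s001", "login",        "2016-09-01T09:00"),
    ("s001", "video_play",   "2016-09-01T09:05"),
    ("s001", "video_rewind", "2016-09-01T09:12"),
    ("s001", "annotation",   "2016-09-01T09:20"),
    ("s002", "login",        "2016-09-01T10:00"),
]

def per_student_counts(events):
    """Roll raw log entries up into per-student event counts,
    one small slice of what an LMS pipeline collects."""
    counts = defaultdict(Counter)
    for sid, etype, _ts in events:
        counts[sid][etype] += 1
    return counts

def engagement_score(counts, weights):
    """Collapse the counts into one combined engagement number,
    the kind of figure a faculty-facing heat map might color by."""
    return sum(weights[k] * counts.get(k, 0) for k in weights)

def rmssd(ibi_ms):
    """Root mean square of successive differences between
    inter-beat intervals (in ms), a standard HRV summary."""
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

counts = per_student_counts(events)
weights = {"login": 1.0, "video_play": 0.5,
           "video_rewind": 0.5, "annotation": 2.0}
print(engagement_score(counts["s001"], weights))   # 4.0
print(round(rmssd([812, 790, 805, 830, 818]), 1))  # HRV from a wearable
```

Real pipelines differ mainly in scale and plumbing; the structure, events rolled up into features and then into a score, is the same.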
The list of what you can do at a classroom level goes on; everything's data these days. And this is the advanced learning analytics work. With the kinds of systems we're building, we literally give faculty a heat map of engagement, showing them a combined engagement score, and they can click on a student to get a specific student profile. One of the things we found in working with faculty, and I wrote a book seven, eight years ago called Practical Magic, which is a study of about 3,000 teaching excellence award winners, is that great faculty members have always used data. They use all kinds of data: formative and summative assessments, projects, writing, eye contact, body language. All kinds of data come in. And what we've seen is that if you can get these kinds of new data to the faculty, they add it to the mix, and when they add it to the mix, they work their art even more effectively. So for example, we've found faculty spotting students who are getting solid A's but who are not engaged; they're bored. And when faculty find that out, they start teaming those students with other students so they can serve as near peers within the course, and you actually get a bunch of engagement that way. Or they see students who are at a high B and who, with just a little bit more effort, could be in the A range; that's a perfect student to nudge over the edge. And again, the great thing about this is that if you can give faculty the data and say, start working your art, they can try and test things in their own environment. That's where it gets even more exciting. I actually think we have infantilized faculty in higher education; we treat them like kids. This is a way we can open up, give them their data back, and let them work their art. Can you imagine a medical doctor having to work in the environments most faculty have to work in with their classrooms? I would argue that most faculty, not in graduate school but at the undergraduate level, are like battlefield surgeons: highly trained professionals in low-information environments who hardly ever see the results of their work, right? It's true. And this is actually giving them the chart, giving them the ability to do the work and see what happens on the other side, which could be kind of encouraging and inspiring. The only thing I would add is that from a student perspective, what they see is actually a lot more human than you would think. They don't see graphs and charts and numbers and facts and figures. They're working through a test, and at the end of the test, or at the end of a section of the test, there'll be a message that looks like it's coming directly from their instructor. It'll say something to the effect of: great job, you really understood these concepts; here's some more material you may want to take a look at. It doesn't look like a machine, and the nuts and bolts running in the background aren't what students see. Any other questions? Hi, thank you guys so much. I'm from an organization called Equity, and we work with high school students and parents around financial planning.
And so, just listening to a lot of the interventions and flags you guys are talking about, a lot of it centers on academic performance. You touched a little on some of the psychosocial elements, but I'm curious what you see in terms of people's finances, affordability generally, and day-to-day spending, even working too many hours, those types of indicators, and any interventions you're seeing that are promising on that front. I wanna make this crystal clear: we are actually trying to catalyze conversations with people that get beyond the purely academic discussion. We think there is a significant portion of students who are being challenged in higher education by extra-academic issues, right? Psychosocial issues, life and logistics, whatever it is. So for example, we just did the Mindset Nudging program at Lone Star College in Texas, where they used the model to identify high-performing students who are at high risk of leaving, and they started sending them mindset nudges, simple messages that said something like: congratulations, you've been so successful here at Lone Star, we're proud of you, but we know college is hard. Here are some of the issues our students are facing: financial challenges, students having problems with their FAFSAs or their Pell Grants, students having problems with childcare, students having problems making their work-life balance work. If you're experiencing any of these, make sure you reach out to us so we can help you. And we had a student at our summit, you know, this was like six, eight weeks ago, who said: I was just about to quit when I got that email. And then I called, and I found out I was gonna lose my financial aid, and this counselor was able to help me find the right grant package, and I was able to stay. That's the kind of thing we're seeing. And again, it's the simplistic flagging I'm really worried about. With precision supports, you can see indicators like: this student is a rock-star student, or this student is a challenged student, whatever it is, but their engagement is going down, right? They started strong, but now their pattern has changed: they were learning one way, and suddenly they're just not accessing the LMS, they're not accessing the course materials, they're not reaching out to the faculty members. Something's happening. And when you can pinpoint that in a week, as opposed to in a semester, you can pinpoint it at the moment of the challenge, find the right student at the right time, and start testing the kind of outreach that'll help that student. So I think you're exactly right; we've got to make sure we're reaching out to students, because clearly a significant percentage of our students, especially in community colleges and access universities, are homeless or food insecure; I mean, 45% of them aren't buying textbooks because they can't afford the textbooks. One of our initiatives is that our college scheduler actually shows the textbook options for the courses, so students can choose one course over another because it uses open educational resources, which is great, because then a student can vote with their feet about what the resource is gonna be.
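A minimal sketch of the kind of week-over-week flag just described, where a student who started strong suddenly stops showing up in the LMS. The login counts, the window, and the drop-off ratio are invented for illustration.

```python
def dropped_off(weekly_logins, window=2, ratio=0.5):
    """Flag a student whose recent activity has fallen below
    `ratio` of their own earlier baseline: a within-term signal
    rather than an end-of-semester autopsy."""
    if len(weekly_logins) <= window:
        return False
    baseline = sum(weekly_logins[:-window]) / (len(weekly_logins) - window)
    recent = sum(weekly_logins[-window:]) / window
    return baseline > 0 and recent < ratio * baseline

# A student who started strong, then stopped logging in.
print(dropped_off([7, 8, 6, 7, 1, 0]))  # True -> trigger outreach nudge
```

Comparing a student against their own baseline, rather than a fixed GPA cutoff, is what lets a flag like this catch the rock-star student whose engagement is sliding, not just the student below a 2.0.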
But that's exactly right: we've got to be willing to engage that conversation and get out of the simplistic, oh, they're below a 2.0 or they failed a gatekeeper course or whatever it is. Of course that's a signal, but it's a small segment of the signal. Yep. Lizzie from the National Science Foundation. So we need to talk, because evidently you've had some awards from us at Arlington; we know of a couple of them. This is a question really about data sharing, and the context I'm thinking about is the fact that so many faculty now are adjuncts. That's a huge problem in and of itself, but then you also have a lot of students who are transferring among schools. What you've described are instances where a lot of data is being accumulated at a single institution, but it seems to me that the real power comes if those institutions can share that data, because then you have access to a potentially enormous corpus against which you could do research, against which you could help the recalcitrant faculty member see the light, if you will. But what are the incentives for data sharing? And is there a federal role in trying to promote standards for data sharing? Not to set them, but to promote community-driven standards. I'll answer that as a researcher, and the short view is that there's a lot of value in collaborating. That's one of the benefits of working with, let's say, MOOCs or MOOC providers. We just finished a paper with Ryan Baker from the University of Pennsylvania and one of the doctoral students we supervise that looked at replication of existing findings out of MOOCs and the MOOC publications; we were particularly trying to replicate 21 specific results that had been produced. To be able to do that against a large edX data set is a really good opportunity, one we haven't had in the past. And that's only one level, one type, of data coming out of a system like edX. I know you funded Ken Koedinger's work as well out of CMU, and he's working on LearnSphere; there has been work at MIT on MOOCdb. So with Ryan and a few other folks, we've been looking at what we need to share around educational data so that we can treat it collaboratively as research. I think there's value in a partner, say, like Civitas, who has a large amount of multi-institutional data; for us as researchers to have access even to part of that could provide really informative outputs. We actually are really lucky in North Carolina. Again, we have 58 institutions in our community college system, so we do have access to a lot of data at a state level, across institutions, that we could be using better. We have a lot going on now in terms of pathway initiatives that are gonna allow us to think about that a little more deeply and look at that information more closely. But a lot of it does come back down to capacity: we don't always have the capacity to do that kind of work, and we do have to rely a bit more on the national players who are able to do it from a research perspective. And we've been deep in this conversation around ecosystem analytics, and we couldn't agree with you more. In our last community insights report, we did an analysis that actually showed that high school data is always as predictive, and at least 50% of the time more predictive, for incoming students than any of the SAT or ACT data.
Simple things like high school GPA and high school class percentile. It's just amazing how important high school data is for thinking about how you're gonna take students in as a community college or an access university. But especially the transfer relationship. There's this assumption that if a student transfers, it's good, and we've actually seen that that's absolutely not true at all. We've seen this transfer tipping point: when a student transfers after finishing the associate's degree, when they finish and then go further, they're significantly more likely to be successful. And one of the greatest things has been our universities looking at that and going, wow, it's penny-wise and dollar-dumb to poach; we should be figuring out how we incent students to finish at the community college before they transfer over. I'm leaving today to go tomorrow to a meeting where two institutions have signed their first data sharing agreement and are starting this process of aligning. We have the folks at Cleveland State University, Lorain County Community College, and Tri-C also doing this work. The whole idea is to federate the data between the institutions to begin doing this analysis, because, and by the way, this is where transparency comes in, one of the most effective nudges we've seen is telling a student: oh, I'm so excited you're ready to transfer, but did you know students who finish first graduate at a 65% higher rate? And the student goes, ooh. So let's figure out how we help you get there. And oh, by the way, there's a scholarship available for students who come in with an associate's degree. I think that kind of K-20, P-20, P-to-career view is probably the next phase of this work. Phase one is turning your own lights on; phase two is connecting that pathway, because most of our students, especially low-income students, are multi-institutional. They're going between institutions on their higher education journey, and we've gotta be willing to show that journey. The National Student Clearinghouse is a great resource in that process, and we're actually gonna be integrating it into our models. But that inter-institutional play, especially for low-income, diverse students who wanna go on math and science pathways, is something we have to get right. Sure, that will be the last question. I have no microphone. So the question is, how do we get there, right? What can we do? If we believe that data sharing is important, and of course that calls to mind the joke that the great thing about standards is there are so many to choose from. Because data sharing and aggregation means my data has to be able to talk to your data, and it only does that if there's some interoperability. So how do we get there? If that's the end goal, can we start now, and what can we do now to pave the way? I'll just throw a couple of ideas out there. I'm a big believer in having policy as an enabling mechanism rather than a regulatory mechanism. So for example, we put out close to $2 billion in TAACCCT grants over the last six, seven years, and they actually built stackability of credentials into those, which I thought was really important.
But imagine if you started seeing incentive programs coming out of the Department of Ed, the Department of Labor, and the states that actually required interinstitutional cooperation in data sharing, that actually built it in as a requirement: to get this grant, you and a partner institution have to agree to share your data and make that data transparent between the institutions. The more we can build that in on the incentive side, where you can say, you can participate in this grant program, you can participate in this kind of process, by doing it, then you're going to start getting the good-actor models, where you're able to show what people do with this stuff, and then you'll start getting people saying, well, we should be doing that too. What I worry about is if we suddenly start coming down and pounding on people. Once you start doing it as accountability analytics, people lock down, right? And you get political, really kind of nasty things happening. But if you can keep it on the incentive side, I think you can get some real action happening. Can I also say, there's a conversation the Gates Foundation recently had about trying to drive this as well, because there needs to be an incentivized data sharing strategy that's available for people to participate in at the state level and nationally. The UK has done a good job of creating an infrastructure for that kind of data as well, as have any of the OECD data sets that have been generated for this. So there certainly is a role for government to play in making that happen at a policy level. From a research end, there's huge value in having some of that data readily accessible, and whether it's, say, an NSF or a Gates Foundation, or a few foundations with NSF pulling some of it together, it would definitely provide value from a research end, and no doubt from an institutional performance end as well. Well, unfortunately, that's all the time we have for today. Thank you all for joining us in this conversation. You can continue the discussion on Twitter using the hashtag #dataethics; we probably should have told you that at the beginning. But this concludes today's event on predictive analytics and ethical use. And if we could, a round of applause for our wonderful panel.