So I'm going to talk about some lessons we've learned from looking at data with data scientists and with educational researchers, the kinds of things I think we're likely to see in the future with Inspire, as more institutions adopt learning record stores and as we see broader adoption of xAPI. Here are some lessons, some early indications, I think, of things we can learn.

I'll do this in three parts. First, I'll give a brief overview of learning analytics, how we position it, how I position it, and some of the work we do. Then I'll talk about some findings we have from research, with two examples. One is a campus-based example from one of the institutions using our X-Ray Learning Analytics; I'll show you the results of the predictive modeling, how it has worked in the wild, some of the accuracy, and the features that matter. The second is a large-scale study that shows why some of this gets a little more complex: differences by tool use, by the different tools that are used in the VLE, and how looking at differences in student performance helps you understand that. And then hopefully we'll have time for questions.

I think of educational technology assessment (and I've been thinking about learning analytics for a while) in terms of the ways we collect evidence to evaluate whether the things we do make a difference or not, and I'd suggest a three-tiered approach. As we evolve and develop as a community of educational technologists and academic technologists, we progress through these three levels. The first is: does it work? Does the server turn on? Are the disks spinning, are the applications working? Those are things we determine through service level agreements or other basic technology monitoring, because if the server doesn't work, if your database is broken, then nothing else matters; nobody will get any use out of it. The second is looking at how many people use it. What's the frequency of use? What percentage of your tutors and your students are using the VLE, and what percentage of courses? That level of adoption also matters. And the third is what matters at the end of the day: not whether it works or whether it's being used, but does it make a difference in student learning? That's exactly where I think learning analytics is located, and it's always nice to be at the apex of the pyramid. That's the most important thing we're really trying to understand and assess.

This is the paradigmatic definition of learning analytics (Gavin also spoke to it this morning, I think), the one that distinguishes learning analytics from administrative or academic analytics: we're looking at learners in their contexts for the purposes of better understanding and then optimizing or improving those learning experiences.

So here is some of what we're doing and looking at at Blackboard, where we have a data science team. Some of the key data sources we're looking at are data from Moodlerooms, data from X-Ray customers, which we model in collaboration with those customers, and data from Learn, our other VLE.
Learn comes in different versions and different hosting environments, which of course does make a difference. We're also looking at Collaborate, our web conferencing platform, and doing large-scale data analysis on that.

In terms of what we're actually doing, there are three general techniques we're applying. The first is simulation: if X, then what Y would happen? For example, we have rule-based triggers, and (I think someone mentioned this this morning) if we were to create a rule for the top 5% of students by grade or by activity and give them a notification, we can run simulations to see how many students that would notify, and whether we think that would be too many notifications or too few. So we can use that to do contemporary, data-driven analytics for software development. The second is hypothesis testing, and this is the one I think is most relevant to learning analytics: things like, what's the relationship between the amount of time students spend in a course and their grade? That relationship is assumed at the basis of a lot of the learning analytics we do, the event-based monitoring and triggering, but we can look at it in detail and see how strong it is and whether it's something we should investigate or pursue. And the third, perhaps the most exciting, is data mining, where we don't have a hypothesis driving us; we simply look for patterns and relationships that might bubble up. This is the one where, after the fact, you look at what emerges and maybe it provides meaningful information, or maybe it's just a pattern that points to something an instructor needs to look at. I'll show a rough sketch of what the first two of these look like in a moment.

Critical to us is a commitment to privacy and openness. We analyze data that has been stripped of personally identifiable information and depersonalized at both the individual and institutional levels. Being here in the UK, it's important to say that we very much understand and are aware of the different laws that apply throughout the world; they're quite complex in some cases, and our analysis is conducted in accordance with them. And the third point, which I think is particularly relevant here at the Moot, is that we have a very strong commitment to sharing the results of our analysis. We don't keep it as some kind of secret sauce that needs to be hidden away and used for commercial purposes. We take it forward and share the results, both internal research that we conduct and research with customers, with their permission of course. That's some of what I'll do today. There is so much knowledge out there, so many research results, and we hope to contribute to that as a member of the educational community. That has not always been Blackboard's reputation, but it is something we are very committed to with our current leadership.
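To make the first two of those techniques concrete, here is a minimal sketch of what a rule simulation and a time-versus-grade hypothesis test could look like in practice. It is illustrative only: the column names (user_id, course_id, minutes_in_course, final_grade) and the synthetic data are hypothetical stand-ins, not our actual event schema or pipeline.

```python
# Minimal, illustrative sketch: synthetic data and hypothetical column names,
# not Blackboard's actual event schema or pipeline.
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(42)
n_students = 5_000

# Synthetic stand-in for one term of per-student VLE activity.
activity = pd.DataFrame({
    "user_id": np.arange(n_students),
    "course_id": rng.integers(0, 200, n_students),
    "minutes_in_course": rng.gamma(shape=2.0, scale=120.0, size=n_students),
})
activity["final_grade"] = np.clip(
    40 + 0.02 * activity["minutes_in_course"] + rng.normal(0, 15, n_students),
    0, 100,
)

# 1) Simulation: if we added a rule that notifies the top 5% of students by
#    activity in each course, how many notifications would actually fire?
top_5_cutoff = activity.groupby("course_id")["minutes_in_course"].transform(
    lambda m: m.quantile(0.95)
)
notified = activity[activity["minutes_in_course"] >= top_5_cutoff]
print(f"Rule would notify {len(notified)} of {n_students} students")

# 2) Hypothesis test: how strong is the relationship between time spent in
#    the course and the final grade?
r, p = stats.pearsonr(activity["minutes_in_course"], activity["final_grade"])
print(f"Pearson r = {r:.3f}, r^2 = {r * r:.3f}, p = {p:.2g}")
```

The simulation step is really just replaying a proposed rule against historical activity before anyone ships it; the hypothesis test is the same check you would run on a real, de-identified activity export.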
All right. So with that, I'm going to turn to some research results and findings. VirtualSC, Virtual South Carolina, is a customer using the X-Ray Learning Analytics product. They're a very large secondary-level program in the southeast of the United States, a Moodlerooms customer using X-Ray, and they have an interesting program where they offer support courses to any high school student in the state of South Carolina. So they're not actually a school; they're part of the state Department of Education, offering courses that improve or augment the on-ground experiences students have within their secondary schools. I think they have around 35,000 unduplicated headcount per year, so around 35,000 students take one or more courses from them every year, and they have very strong, robust use of Moodle. They came to us looking to use X-Ray to identify students who are at risk of dropping out before they dropped out, so they could do something about it.

Hopefully this is legible on the screen; I'll have to read it a little, I think. The first thing is that we created a predictive model, and, again in that commitment to openness, these are the things that matter in creating that model, from top to bottom. On the far left-hand side is the name of each of the features that we found to matter by analyzing and mining their historical data. And then we have the predictions: prediction number one is the prediction of a student passing the course, the second is being in an intermediate category, maybe yes, maybe no, and the third is a prediction of a student failing the course. We're using a technique called random forest. If you're familiar with statistics, it's a contemporary technique; it tends to be a bit more accurate than regression, which is the more common technique used by educational researchers, but the results are a little harder to interpret.

The features run from top to bottom. At the top is the grade on quizzes, and next is the grade over possible points. One of the things we find is that the current student course grade is very important, but we also need measures that capture whether a student has not taken all of the assignments, so we're looking at both. If a student has a very high grade but they've only taken one of multiple quizzes, that's not a very strong indicator. The third most important is time on task, and, generalizing, one of the things we're finding is that time and activity within the VLE are weaker indicators, usually landing around third or fourth. The fourth most important feature is course week, which is where you are in the course by week. I've got five minutes, so I'll run through the rest quickly: times and days of activity, weekly regularity, the number of words written, the number of enrollments a student has concurrently within the VLE, and then a number of other indicators, some of which get at things that are a bit more sophisticated. We have a Flesch-Kincaid measure, which is a measure of the readability of discussion forum posts. Looking at the quality of interaction is often possible now with the modeling techniques we have; it's less important, but it's also something that's helpful for teachers to be able to see, so they can have meaningful interactions with their students.
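Since I can't walk through the production model line by line, here is a hedged sketch of the general shape of that approach: a three-class random forest (pass, borderline, fail) over per-student features like the ones just listed. Every feature name and all of the data below are hypothetical placeholders so the example runs end to end; the real features were mined from the institution's historical data.

```python
# Hedged sketch of the general approach (three-class random forest), not the
# production X-Ray model. Feature names and data are hypothetical placeholders.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 8_000

X = pd.DataFrame({
    "quiz_grade": rng.uniform(0, 100, n),            # grade on quizzes taken so far
    "grade_over_possible": rng.uniform(0, 1, n),      # earned grade / total possible
    "time_on_task_min": rng.gamma(2.0, 90.0, n),      # time on task in the VLE
    "course_week": rng.integers(1, 15, n),            # how far into the course we are
    "weekly_regularity": rng.uniform(0, 1, n),        # how evenly activity is spread
    "forum_word_count": rng.poisson(200, n),          # words posted in discussions
    "concurrent_enrollments": rng.integers(1, 6, n),  # other courses taken at once
    "flesch_kincaid_grade": rng.normal(9, 2, n),      # readability of forum posts
})

# Synthetic outcome loosely tied to the grade features, purely so the example
# runs end to end; real labels would come from historical course outcomes.
score = 0.5 * X["quiz_grade"] + 40 * X["grade_over_possible"] + rng.normal(0, 10, n)
y = pd.cut(score, bins=[-np.inf, 45, 60, np.inf], labels=["fail", "borderline", "pass"])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
model = RandomForestClassifier(n_estimators=300, random_state=0)
model.fit(X_train, y_train)

print(classification_report(y_test, model.predict(X_test)))
print(pd.Series(model.feature_importances_, index=X.columns).sort_values(ascending=False))
```

The feature_importances_ output is the random-forest analogue of the ranked feature list on the slide, which is the trade-off I mentioned: a bit more accuracy than regression in exchange for a model that is harder to read directly.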
I'm going to skip ahead after this, because there's a whole other body of research I want to show you, but here is the predictive accuracy of that model. Across the left, middle, and right you can see students grouped by the risk level they were classified at, starting with those classified as low risk, and then by week of the course. Those in green are the ones that ended up passing the course; those in red are the ones that ended up failing it. Just briefly scanning this, you can see that our models get more accurate over time, especially for students at risk of failing the course. We're generally finding somewhere between week two and week four to be the sweet spot, where we really hit a higher degree of accuracy.

So I'm going to zoom fast past this to look at a different body of research we've done, on generalized student use at scale, the relationship between student use of the LMS and their final grade in a course; I think this may be very relevant to you as you think about results at your institution. We're developing a VLE platform and creating analytics we hope will be generalizable and generally useful, so the first question before doing that is: is there a relationship, when you look at scale, between use of the VLE and student grade in a course? This is a scatterplot of 1.2 million students. You don't see that every day; it's something fun that we have the capability of doing. It turns out there is a significant relationship between student use and grade, but it's minute when you look at large scale across multiple courses: less than 1% of the variation in student grade is explained by the amount of time they spend. Again, we're generalizing here over about 35,000 courses. But we find that that result varies dramatically by course. When we looked at this at the course level, we found that the explanatory power ranged anywhere from hardly any to over 50 or 60%. So the question is why. Why is there so much variation? What helps us understand it, and what would make us better able to use it to provide features and learning analytics that are meaningful to tutors, administrators, and hopefully students?

One of the things we did was separate this out by tool use, and I'll go very quickly here. This is a chart looking at the grades tool. The topmost line is students who passed a course, the yellow line is students who got an okay grade, between 60 and 80%, and the bottom red line is students who failed the course. Along the bottom we have levels of use: students who never looked at their grades, then the bottom 25% of use, those who used it less than average, then 50%, 75%, and 100%. What we found is that students who looked at their grades more often, at every single quartile of use, were more likely to get a good grade. Is this causal? Does looking at your grades make them increase? Of course not. My son, a 14-year-old who likes to look at his grades, would really like that to be true; I can tell him to study more instead. But the point is this: if students look at their grades more, is it that awareness that has an effect? And would it be a good idea to push grades in front of students, even if they weren't looking for them? We think so. And probably the most striking thing, in my mind, after looking at this, is that this is the only tool type in which we found this relationship. Other tools did not show the same thing.
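For anyone who wants to try this kind of analysis on their own VLE data, here is a rough sketch of the two pieces just described: the per-course explanatory power (R squared) of time in the course on final grade, and pass rates by level of grade-tool use. It runs on synthetic data with hypothetical column names; it shows the shape of the analysis, not our actual code, and the numbers it prints are meaningless beyond illustrating the output.

```python
# Rough sketch of the two scaled analyses, on synthetic data with hypothetical
# column names -- the shape of the analysis, not Blackboard's actual code.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 50_000

df = pd.DataFrame({
    "course_id": rng.integers(0, 500, n),
    "minutes_in_course": rng.gamma(2.0, 120.0, n),
    "grade_tool_views": rng.poisson(8, n),
})
df["final_grade"] = np.clip(
    30 + 0.03 * df["minutes_in_course"] + 1.5 * df["grade_tool_views"]
    + rng.normal(0, 20, n),
    0, 100,
)
df["passed"] = df["final_grade"] >= 60

# 1) Per-course explanatory power: how much of the variation in final grade
#    does time in the VLE explain within each course?
per_course_r2 = df.groupby("course_id").apply(
    lambda g: g["minutes_in_course"].corr(g["final_grade"]) ** 2
)
print(per_course_r2.describe())  # distribution of R^2 across courses

# 2) Pass rate by level of grade-tool use: never used it, then quartiles of use.
never = df["grade_tool_views"] == 0
users = df[~never]
quartile = pd.qcut(
    users["grade_tool_views"].rank(method="first"), 4, labels=["Q1", "Q2", "Q3", "Q4"]
)
pass_rates = pd.concat([
    pd.Series({"never": df.loc[never, "passed"].mean()}),
    users.groupby(quartile, observed=True)["passed"].mean(),
])
print(pass_rates)
```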
Here is course content, by contrast: looking at resources, looking at activities, web links. Going from not looking at them at all to looking at them was associated with an increased likelihood of getting a good grade, but beyond that it was not. We can talk afterwards about why that's the case; I have a number of hunches and intuitions around it, and you probably have your own as well. But it helps us think about learning analytics in a way that isn't simply "more is better". There are different ways to analyze and look at that behavior that can help us provide deeper understandings and more meaningful inferences to students.

With assignments and assessments, we found the inverse to be the case. Students who spend more than the average amount of time on quizzes and assignments are actually more likely to get a lower grade than those who spend up to the average amount of time. Think about a face-to-face environment: the students who stay until the very end of the test are often not the ones who are diligent and understand everything; they're the ones who are struggling with the material, perhaps for a variety of reasons. And when we began disaggregating this by high and low amounts of usage, we found that it varies quite a bit, which gets to the point about course design. In courses that used a lot of discussion forums, that relationship was stronger than in courses with low usage. So there's quite a bit of complexity when we tease this data apart, in ways we couldn't before, and it gives us more meaningful understandings of what's happening within students' experiences.

All right, so, some implications and things we're doing next, and I'm happy to take questions. Thank you, and there's my contact information.