Hi, welcome back. It's great to be here. My name's Emma Proctor-Legg and I'm the chair for this session. I'm really pleased to be joined today by Scott Bartholomew and Matt Wingfield. This session is entitled Learning by Evaluating: Student-Led Formative Assessment. The session is 25 minutes and we're happy to take questions throughout, so please do use the chat to post questions and comments. I will now hand you over to Scott.

Welcome, everyone. It's good to be joining you. I'm joining you from Utah in the United States of America, so just across the pond, but I'm very honored to be here. I'd like to take a minute to introduce myself and then my colleague, Dr. Nathan Mencer, and then I'll let Matt Wingfield introduce himself. I am a professor of technology and engineering studies at Brigham Young University, so my role is primarily in teacher education, training design and technology teachers, or as we call them here in the United States, technology and engineering teachers. I've been at Brigham Young University for one year now. Prior to that, I was at Purdue University for four years as a professor in a similar department there, and before that I was a middle school teacher teaching design and technology classes for three years, and I taught high school as well. So my background is really in education and teaching, and that's where my passion lies: working with teachers and in classrooms.

I want to introduce Dr. Nathan Mencer as well. He's unable to join us this morning, but he is a professor of technology and engineering education at Purdue University. He works in teacher training and is heavily involved in research and graduate student advisement. He has been a professor at Purdue University for more than 10 years, and his background also lies in education: he was a high school teacher before he became a professor at Purdue. So both of us have that K-12 background and experience, and that very much shapes everything we do. I'll turn the time over to Matt Wingfield to introduce himself, and after he's done, I'll take us through some slides.

Great, thanks Scott, and hello everybody. My name's Matt Wingfield. Just to complete the set on the K-12 teaching front, I used to be a primary school teacher, but I've been in the education technology space now for about 25 years. I work as a freelance consultant, supporting technology providers, exam awarding bodies, universities and colleges in their use of educational technology. I'm currently doing a piece of work with a company called RM, who are involved in the delivery of this project, and I'll talk a little more about them and how we've been involved later on. In my spare time, I'm also chief executive of the e-Assessment Association, which is a not-for-profit organization that supports and promotes the use of technology in and around formative and summative assessment. So that's me. I'll pass back to Scott.

Perfect, thanks Matt. Let me talk a little bit about our agenda and the things we hope to cover today. We've done introductions, and now I want to talk about a scenario we found ourselves in at Purdue University in the spring of 2019: what the challenge was and what our hypothesis was. I'll talk a little bit about the approach we took as we tried to tackle this challenge, the outcome, and some of the next steps. So let's start with the challenge and the hypothesis. Myself and Dr.
Mencer both taught an introductory design thinking class to freshmen at Purdue University; all the freshmen in our college had to take this course. As colleagues, we talked quite frequently about this course, how we could improve it, and things we could do. One of the questions we tackled is the one here on the screen: how could we facilitate impactful formative assessment experiences that were scalable across the classes but didn't negatively impact the teachers? Often we found we could do things that were helpful to students but very taxing for the teachers, or the alternative would be to do something that was easy for the teachers but didn't really have the same impact on the students. We wanted to empower our students to learn independently; we wanted them to know what good looks like.

In our field, we work in the design space, which is a very muddy area. There's not a clear right and wrong answer; usually there are lots of potentially right answers. We're tackling ill-defined problems where we don't know what the right answer is, and we really just want students to be very creative. But as teachers we can also recognize that there are better answers: some students might propose a solution that's better than their peers', and we wanted our students to be able to see that on their own, to improve their own learning by understanding what good looks like.

So here's the approach we took. I just want to point out that because we were dealing with such a big group of students, between 500 and 600, this required a partnership approach; I'll talk about this partnership later on and circle back around to it. Here's what we did. One of the assignments our students had the hardest time with is called a point of view statement. In a point of view statement, the students identify a user, a specific person they're trying to design for; then they identify that user's need, the problem the user has; and then they identify a particularly interesting insight into that need that might shape their design thinking and their approach. Just as an example: maybe the user is a university student, maybe their need is that they're having a really hard time finding parking, and maybe the insight is that there's ample parking on campus but they're not willing to pay for it, or there's ample parking on certain parts of campus and not others. The students craft these point of view statements, and those become the crux of their designing later on. What we discovered is that this seemed to be the point where things were breaking down, where the students were having the hardest time, so we wanted to make it the focus of our efforts in this project.

So what we did is take all the students enrolled in this course, between 550 and 600, and split them into two equivalent groups. We did that at the teacher level: every teacher of this course had two classes, and one class would receive the intervention while the other would not. There would be a control group and an experimental group, and our intervention was very simple and very small.
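As a concrete illustration, this teacher-level split might look something like the following in code. The teacher and section names are invented, and this is only a sketch of the counterbalancing idea, not the study's actual assignment procedure:

```python
import random

random.seed(1)  # arbitrary seed, just to make the example reproducible

# Hypothetical roster: each teacher of the course has exactly two classes.
teachers = {
    "Teacher 1": ["Section A", "Section B"],
    "Teacher 2": ["Section C", "Section D"],
}

# Split at the teacher level: per teacher, one class gets the intervention
# and one does not, so teacher effects are balanced across conditions.
assignment = {}
for teacher, sections in teachers.items():
    experimental = random.choice(sections)
    for section in sections:
        assignment[section] = (
            "experimental" if section == experimental else "control"
        )

print(assignment)
```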
The intervention was an experience of around 10 to 20 minutes in which students used a piece of software called RM Compare, which I'll talk about. In the software, they had an opportunity to look at different items, evaluate them side by side, choose the one they thought was better, and then view another pair of items. The items they were viewing were point of view statements from last year's cohort of students, and we hypothesized that if we showed them last year's work and had them do this evaluation side by side, they would do better than their peers who didn't get this opportunity. The intervention was very short, like I said, 10 to 20 minutes; some students took the full 20 minutes, but most took only around 10. They simply sat down, viewed the items, and then went on and created their point of view statements. Our plan was that at the end we would take all the point of view statements from both the control and the experimental group, have the teachers assess them, and then run some statistical tests to see if there was a difference; in other words, did the students who had this opportunity do better than their peers, or were the two groups still equivalent?

Let me pause and take a little bit of a break here to talk about the intervention, the opportunity where they were looking at point of view statements. The idea behind this rests on the law of comparative judgment, which has been around for a really long time. The basic idea is this: people are better at making decisions between two things than they are at making absolute judgments about a single item. For example, when you go to the eye doctor, if you have glasses, the eye doctor shows you two different options and asks which one is better, rather than saying, "on the wall I have a poster; go choose the one that's right for your prescription." If we were asked to do that, it would be very difficult, but when we compare items side by side, it becomes much simpler. That is the underlying principle in RM Compare, the software we used. So rather than showing students point of view statements from a previous year and asking how good each one is, we showed them pairs of statements and said, choose the one that's better. We found this is a much easier way for students to learn and to develop a nose for what good looks like, because as they go through, they look at one, they look at the other, and they ask: which one is better? This one. Well, why is it better? And as they have to think about why it's better, they start to articulate that, and they start to really internalize those ideas.

The student experience looked just like this. All they did was see two POV statements side by side; they would read them both and decide which one was better, A or B. We also prompted them to provide some sort of comment or rationale: why did you choose A over B, or B over A? What made this one stand out to you? That was intentional, because we wanted the students to articulate why they were choosing one item over another. Once again, it was very short, around 10 minutes of time, but that was a key part of this experience.
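The session doesn't go into how the software turns a pile of pairwise choices into a rank order, but adaptive comparative judgment tools are generally built on a pairwise comparison model such as Bradley-Terry or Thurstone's. Here is a minimal, illustrative Bradley-Terry fit in Python; the item names and judgments are invented, and this is not RM Compare's actual algorithm:

```python
from collections import defaultdict

def bradley_terry(judgments, n_iters=100):
    """Estimate a quality score per item from pairwise judgments.

    judgments: list of (winner, loser) pairs.
    Returns {item: strength}; higher strength = judged better.
    Uses the simple minorization-maximization update for Bradley-Terry.
    """
    wins = defaultdict(int)          # comparisons each item won
    pair_counts = defaultdict(int)   # times each unordered pair was compared
    items = set()
    for winner, loser in judgments:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        items.update((winner, loser))

    strength = {i: 1.0 for i in items}
    for _ in range(n_iters):
        new = {}
        for i in items:
            # Sum over every pairing that involved item i.
            denom = sum(n / (strength[i] + strength[j])
                        for pair, n in pair_counts.items() if i in pair
                        for j in pair - {i})
            new[i] = wins[i] / denom if denom else strength[i]
        mean = sum(new.values()) / len(new)
        strength = {i: s / mean for i, s in new.items()}  # fix the scale
    return strength

# Invented judgments over three POV statements.
judgments = [("A", "B"), ("A", "B"), ("A", "C"),
             ("C", "A"), ("B", "C"), ("B", "C")]
scores = bradley_terry(judgments)
print(sorted(scores, key=scores.get, reverse=True))  # ['A', 'B', 'C']
```

With a balanced set of pairings like this the fitted order simply follows the win counts; the model matters most when pairings are uneven, as they are in an adaptive session.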
So what did we find? We had the control group students do class as normal, and the experimental group students do class as normal plus the very small intervention. Here's what we found, statistically speaking, and then we'll talk practicalities after that. There was a significant difference between the learning by evaluating students, which is the term we've coined for this experience, and the control group, and you can see the stats there, the t-test. Seven of the top 10 students were from the group that had this experience, where they got to look at items side by side, the learning by evaluating experience. Some might say, well, there's a statistical difference, but what's the practical difference? Practically speaking, you can see I've got Cohen's d there, which is a measure of practical significance, and two thirds of the control group, who didn't have the intervention, scored below the average student from the RM Compare group, those who had this opportunity to evaluate. So you can see there's some practical significance there as far as boosting student achievement and helping them to do better.

We were really excited about this: something as simple as 10 minutes of time had a significant impact on our students, statistically and practically. So then we said, okay, what's next? The next steps were, first, that we've started implementing this in our classes at Purdue; it's become part of the standard course experience. But we also wanted to take it and make it bigger. So we applied to the National Science Foundation for what's called a Discovery Research K-12 grant. This grant is to do a three-year study taking this very small intervention and putting it into schools in the Cobb County School District, in the Atlanta, Georgia, area, one of the biggest school districts in Georgia, a very large district. We wanted to explore what happens if we take this approach and use it with high school students, grade nine, 14-year-old students. Could this be really big? We received the grant for one and a half million dollars, which will allow us to involve a lot of students across a lot of schools and to collect a whole bunch of data, to see what the potential is for putting this in widespread use across a school district and classrooms across a county, and then hopefully moving outward to a state, a region, and nationally, to see what the opportunities are.

I'm going to turn the time over to Matt to talk a little bit about this slide, which shows another aspect of this experience and this software. This wasn't something we used in this study, but it is something we're dabbling with as we experiment down in Cobb County. So Matt, I'll pass the baton over to you.
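To make the slide's statistics concrete, here is a small sketch of a two-sample t-test and Cohen's d, and of how a d value maps onto the claim that about two thirds of the control group fell below the experimental group's average. The scores are simulated, not the study's data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=42)

# Hypothetical teacher-assigned POV scores; NOT the study's actual data.
control = rng.normal(loc=70, scale=10, size=300)
treated = rng.normal(loc=74.3, scale=10, size=300)  # simulated effect

# Welch's two-sample t-test (does not assume equal variances).
t_stat, p_value = stats.ttest_ind(treated, control, equal_var=False)

# Cohen's d: standardized mean difference using a pooled SD.
pooled_sd = np.sqrt((treated.var(ddof=1) + control.var(ddof=1)) / 2)
d = (treated.mean() - control.mean()) / pooled_sd

# Under normality, the share of the control group scoring below the
# treatment mean is Phi(d); d around 0.43 corresponds to roughly two thirds.
share_below = stats.norm.cdf(d)

print(f"t = {t_stat:.2f}, p = {p_value:.4f}, d = {d:.2f}, "
      f"share of control below treatment mean = {share_below:.0%}")
```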
Thanks, Scott. Yeah, one of the challenges, of course, in taking this into the wild, if you like, into a non-research scenario, is being able to understand what the impact is without having a control group of students who don't experience it, and to see which of the students have got this concept through the learning intervention and which haven't and thus might need a little bit of extra support.

One of the nice things the software does in the background is monitor, statistically, how the students are interacting with their comparisons and the judgments they're making as they choose, in Scott's case, the better point of view statements. It maps that out graphically for the teacher, which is what you can see on the screen here, so you can understand which students are in kilter and get the concepts they've been trying to get their heads around, and which don't. The blue dots are the students, and most of the students are gathered in that white band from 0.05 to 1.5; that's the consensus group. All of those students are pretty much in agreement with each other and understand the concept they're studying. The student shown in the blue band is slightly different: the system is statistically calling that student out and saying to the teacher, you might want to spend a bit of time with them, just to make sure they understand what's going on. Because, as I was saying, you can't have a control group in normal teaching circumstances, this is a way the software additionally helps you understand which students get the concept and which need a little more help to get their head around what good looks like in this context.

The other thing that's worth briefly touching on at this point is that in a normal classroom scenario it may be difficult to show students only good pieces of work. In Scott's first experiment, which he just talked about, they used good point of view statements as the exemplar work shown to students to help them understand what good looks like. But he then ran another project the following year where they used a mixture of good, indifferent and not-so-good pieces of work. And Scott, you found exactly the same effect, didn't you? There was still a significant positive impact on the attainment of the students who were exposed to those scripts in this particular way as well.

Yeah. What we found is that when students have the opportunity to go through this learning by evaluating, it becomes a chance for them to really internalize not only what good looks like, but also the assignment itself. As teachers, we give students an assignment, but sometimes they have a hard time getting their head around what it is the teacher wants them to do, what's expected; students don't know where the bar is set. This opportunity to compare items side by side helps them start to establish internally: okay, this is where the bar is set, this is what's expected, this is what my teacher is hoping to get from me. So that's one of the great things we found: students get to really internalize and understand the assignment.

Yeah, fantastic, thank you.
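Matt doesn't specify which statistic the software uses to call out a divergent student, so the sketch below uses a simple stand-in: score each judge's agreement with the consensus ranking and flag anyone whose agreement rate is improbably low. The 75% reference rate, the function name, and the data layout are all invented for illustration:

```python
from scipy.stats import binomtest

def flag_outlier_judges(judgments_by_judge, consensus_scores,
                        expected_agreement=0.75, alpha=0.05):
    """Flag judges whose pairwise choices disagree with the consensus.

    judgments_by_judge: dict of judge -> list of (chosen, rejected) pairs.
    consensus_scores:   dict of item -> quality score (e.g. a Bradley-Terry
                        fit over everyone's judgments).
    A judge is flagged if their agreement with the consensus ordering is
    significantly below expected_agreement (one-sided binomial test).
    """
    flagged = []
    for judge, pairs in judgments_by_judge.items():
        agreements = sum(consensus_scores[chosen] > consensus_scores[rejected]
                         for chosen, rejected in pairs)
        test = binomtest(agreements, n=len(pairs),
                         p=expected_agreement, alternative="less")
        if test.pvalue < alpha:
            flagged.append(judge)
    return flagged
```

In Rasch-based comparative judgment tools, an infit statistic typically plays this role; a fixed agreement threshold like the one above is only a rough proxy.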
Could we move on to the next slide, please? And then the next one again as well. I thought it might also be useful just to understand where comparative judgment has been used, because, as Scott was saying, the theory behind comparative judgment goes back to the 1920s, when a guy called Louis Thurstone came up with the law of comparative judgment. But as a piece of software, we've been using this software-based comparative judgment approach, or adaptive comparative judgment approach, since 2008, when a research project was initiated by Goldsmiths, part of the University of London, and then in the wild in projects from 2012. I won't go through all of them; you can see the various organizations that have been involved. But I think it's worth calling out the two UK universities who are using this quite prolifically now to support this formative approach. It's not exactly the same as what Scott's described: they've been using it more for peer assessment, so using the students' own work, having that peer assessed, and getting the students to provide feedback on that peer-assessed work. That's the University of Manchester, who have now been using the software for about six years, and Queen Mary University of London, who have been using it since last year and have just taken a new license for the product, using it with over 2,000 students in their School of Biological and Behavioural Sciences, again to support this formative peer assessment approach, to catalyse a better understanding of what good looks like, and to capture and provide feedback to the students.

What I thought might also be useful, and we're going to test the technology a little bit here, is to hear very quickly what a student thinks about this approach. So Scott, if we can go to the next slide, and if you can click play, if you hover your mouse over the slide, hopefully the audio will come through. And it's not going to. So never mind, thank you, Scott. I apologize, Matt, sorry. Sorry, we should have tested it. We'll share the link to that video so that you can get some insight. This particular student was from the University of Edinburgh, where they were using this comparative judgment approach to facilitate formative peer review and feedback. One of the comments that Brianna, the student, makes is how it gave her such clear insight into what her tutors wanted her to do. That goes back to what Scott was saying earlier about how the approach helps students understand what they need to do: not only what good looks like, but what's expected of them in the particular context they're faced with. If you get a chance to watch the video, it lasts two minutes and we'll share the link; it's really insightful to hear what Brianna says, because it was clearly quite a revelation for her as a student at the university. She makes the comment, as a final-year student, that she wished she'd had that experience earlier in her degree, because then she'd have known how to write better essays and better pieces of work.

The last points I wanted to make before we turn over to questions are that the piece of software we're talking about here, RM Compare, is not the only comparative judgment software out there. There are a number of other commercially available comparative judgment packages. Some have come from universities: the University of Antwerp developed a piece of comparative judgment software through a research project, which it has now made commercially available to anyone who wants to use it, and the University of British Columbia in Canada has likewise created another piece of software.
So it's worth mentioning that there are other bits of software out there as well. And Scott, I don't know whether you had any thoughts on why you chose RM Compare, because that might be something people would be interested in hearing about.

Yeah, that's a great question. I've used all of them, and they all function well as far as engaging students. My experience so far, working with teachers, has been that RM Compare is the most approachable in terms of ease of use and ease of understanding the different reports and statistics that come out of it. The interface is easy and intuitive for the students to use. So yes, there are definitely lots of options out there; the reason I ended up using RM Compare is just that it was a bit more approachable for the teachers and the students. But the key is facilitating these judgments and making it such that it's not only a learning benefit to the students, but also not so difficult for the teachers that they choose not to use it, because it's too much work or too much effort.

Yeah, that's great, thank you. If you just click on to the last slide, there are a number of links here, which we'll also pop into the chat so that you've got them more easily to hand. Scott's going to be running an interactive webinar on the 30th of September; there's a link there to register. The big difference between what we've just shared with you and that session is that you'll be able to have a go and try it out as well; it was a bit difficult to arrange that through the platform we're using today. There's a link there to the full research paper, so by all means have a read of that if it's of interest, and a link to an overview video. It's also worth noting that you can access the RM Compare software for free: it's a fully functional free version that you can use for up to three judgment sessions. So please feel free to try it and see what you think. And if you've got any questions that you didn't get a chance to put in the chat, you've got our email addresses there as well. I noticed a comment come through from Anna saying she's looking forward to the webinar. Thanks, Anna, it'd be great to have you there; please do register for the session. And I think with that, Emma, we'll hand back to you.

Thank you very much. It was a really excellent session; I was really interested to hear all about it. It seems like a really great piece of technology to use, and my brain is already going, oh, I could use it to do this sort of thing. So thank you so much for such an interesting session. It was really great.

Thank you. Fantastic. Thank you very much.