everybody to CMS colloquium. As most of you probably know, but it bears repeating for guest attendees, the way that we run the colloquium is that we have our graduate students and some other members of the MIT community on screen with our guest. After the presentation, there'll be a Q&A in which folks on screen can directly ask questions, and those who are attendees offscreen can ask questions via the Q&A bar. So that is very much encouraged. Tonight, I am going to hand over to my colleague, Justin Reich, to introduce our guest, Joshua Littenberg-Tobias.
Great. Thanks so much, Vivek, and thanks everybody for being here. So we make media as a society for many purposes. We make media to entertain. We make media to persuade. We make media to inform. And one of the things we do is make media to educate, to help people build new capacities that they didn't have before. There are lots of disciplines that can contribute to the study of media that educates. The learning sciences are kind of a bundle of those disciplines, connected to psychology and cognitive science, that are interested, particularly in technology-inflected ways, in how we create media that most effectively builds new capacity. And so Josh and I work in a lab, the Teaching Systems Lab, which is really interested in these questions of how we help people learn better. Josh has specialties in measurement and evaluation. There are lots of different ways to improve media for education, and one of those ways is to figure out whether or not the things you're doing are working. If you say that you want to create media that helps teachers do a better job with anti-racist teaching, what does that even mean? How would you figure out whether or not your media is doing the things that you wanted it to do?
And you could ask the same question if you make media that helps kids learn to divide fractions or adults conjugate Spanish verbs: how do you know whether or not what you're doing is really working? How do you know whether or not it's working better than something else? So that's the expertise that Josh brings to our interdisciplinary lab. We've been doing some really cool work together over the last few months to figure out how we can help support teachers doing a better job with equity teaching practices. And Josh has come up with some really innovative ways to help us figure out how we're doing, using a combination of quantitative, qualitative, and computational approaches. So I'm really excited to turn it over to Josh to let you all learn more about the work that he's doing with Marvez and many other folks in the lab. Oh, I wanted to say one other thing, which is that Josh has some job talks coming up for some faculty positions he's applying for, and this is a bit of a practice for one of them. It's actually quite an interdisciplinary audience for this other talk that he's giving. So you all should feel very free to be very candid in your criticism of Josh's talk, both on the substance and on the delivery. If there's stuff that doesn't make sense or that you think doesn't land well, it'd be very generous of you to let him know so he can improve on those things before the live event coming up. So over to you, Josh.
Thanks, Justin. I appreciate the introduction. I just wanted to say before I jump into it that this is work that I've been really honored to be involved with, and I've worked with a lot of really great people at the MIT Teaching Systems Lab, including Elizabeth Borneman, who was a graduate student in CMS last year; Marvez, whom I've worked with for a number of years; Chris Buttimer, who's a postdoc; and many, many other people who've contributed to this research.
And so a lot of what I'm doing, I couldn't have done without the help and support of working with other people. So I'm going to share my screen and we will get started. My talk today is on measuring equity-promoting behaviors in digital teaching simulations. I'm going to talk a little bit about how I used topic modeling, which is a form of natural language processing, to understand what's happening in the simulations. But before I dive into the specifics of what I did, I need to give you a little bit of background about teaching and education, to give you the 3,000-foot view so you can understand what we did and why. So, just to go over how the talk will be structured: I'll start with some background about the topic and why we looked at the things that we did. I'll talk about what our digital equity teaching simulations are; there are a lot of words there to unpack. Then I'll talk about the specific analysis that I did using something called structural topic modeling, and I'll explain a little bit about what's under the hood, how that actually works in practice. Then I'll present some results from an analysis that I've worked on over the last couple of months, looking at a course that we administered last spring. And finally, I'll present some future directions about where I see this research going over time. So to start, just some background. Many of you have heard of the term "the achievement gap." This is a term that's very commonly used in education policy circles, and it refers to the difference in achievement between white and Asian students and Black, Hispanic, and Native American students. This is something that policymakers have been talking about for a long time, particularly since the 1980s, and there have been all kinds of education reform efforts to close the achievement gap.
And I know some of you may have heard of No Child Left Behind, which was under President Bush; then there was Race to the Top with Obama in the 2010s; and then there was the Every Student Succeeds Act in 2015. So we've had all these iterations of reform. But the thing is, there hasn't been a change. If you look at this graph that I'm presenting, you see that the achievement gap has pretty much remained constant over time since the early '90s. So with all the education reform we've been doing, there hasn't actually been a change in the achievement gap. And particularly in the last few years, there's been a lot of criticism of even the term "the achievement gap." There was research that came out last summer that actually showed that if you show people videos about the achievement gap, it tends to be associated with negative stereotypes about African American students. And there is a lot of criticism about framing this thing in terms of a gap. The reason for this is that by talking about the gap, many people attribute that to the actual characteristics of the students themselves, and not to the opportunities that students have and the experiences students have in school. So by focusing on outcomes, you're ignoring all the inputs and all the experiences that students are having that lead to these differences in academic achievement. And so in our work, we often draw on the work of a scholar named Richard Milner, who talks about this idea of the opportunity gap. In his work, he says that we really need to focus not just on achievement outcomes, which are important (we don't want to ignore differences in outcomes), but that it's also really important to understand the reasons for those differences.
And in his work, he talks about all the ways that schools systematically discriminate against particularly Black and Latinx students, and all students of color, in the way that schools are structured: do students have teachers who look like them, what are the dominant cultures within schools, how do school rules and policies affect students? And there are many, many other ways, including curriculum that is not culturally responsive, and standardized testing that doesn't capture all of students' abilities. So these are all factors. In this talk, I'm going to focus on one specific aspect of this, which is this idea of discretionary spaces in teaching. This idea comes from the work of Deborah Ball, who basically said that in teaching, teachers have a lot of things that they can't control. Often they can't control when school happens, and they can't control what their classroom necessarily looks like or what type of building they're in. But anyone who's been a teacher knows that there are a lot of decisions that teachers have to make every day, and that these decisions can have a big impact on students' experiences. In her research, she talks about really examining these discretionary spaces and understanding what the forces are that affect what teachers decide to do, and increasingly thinking about how these discretionary spaces either perpetuate or disrupt racism and racist structures. So for example, if a teacher sees a student who has their phone out, they have a number of different choices of what to do. They can just ignore it and keep on teaching. They can go over to the student and say, "Hey, you're not supposed to have your phone out." They can confiscate the phone. They can send the student out to the principal. And all of these decisions have implications further down the line. We know from research that discipline in school is connected to the school-to-prison pipeline.
So even though you might think of this as an individual decision, all of these things accumulate and add up over time. So I'm going to transition now to talking about our digital equity teaching simulations. In these simulations, what we're trying to do is capture practice in these discretionary spaces. And I wanted to give you some examples of what this looks like in practice. So this is from one of our simulations called Jeremy's Journal. To give you a little background on Jeremy's Journal: in the simulation, you play the role of a teacher who has a student named Jeremy, and Jeremy is sometimes very actively engaged in class and other times disengaged. He's generally very friendly and social, but sometimes struggles to focus on the actual assignment that he's supposed to be doing. One day, he misses class, and you don't get any reason why he missed class. The next day, he comes to class and he says, "I'm sorry I was out yesterday. I wasn't feeling great and had to go home. What do you want me to do for make-up work?" And he presents this note from his mom. The thing is that the school policy is that you have to have a note from a doctor. So in this moment, what do you do? Response one is to say, "We missed you at class yesterday, hope you're feeling better," or response two, "Remember that the school policy is that we need a signed doctor's note in order to be excused from class." And I wanted to try something; we'll see how it works. I'm going to send out a poll. I only want you to answer the first question. I want you to say which of these two responses you would choose: one, "We missed you at class yesterday," or two, "Remember the school's policy is that you need a signed doctor's note in order to be excused from class." So you just need to answer the first question. Does everyone see the poll? I see.
If you can't vote, you can just put either one or two in the chat for which one you would do. Seeing a lot of ones. So I think we have an agreement. Many people... Yeah, I think everyone's panelists. Okay. So it's good to know that that technology is not particularly reliable. So I'm seeing a lot of ones. These are actual responses that people gave in the simulation when we did this in the course, and they represent two different ways that people can respond to the situation. What we're trying to do with these scenarios is capture these individual moments of teaching and provide people with options: what would I do in this moment? And once you do that, give them time to reflect: why am I making the choices that I'm making? Are these choices actually perpetuating racism, or are they disrupting it? And so why simulations? Simulations have a number of affordances that make them particularly useful for talking about teaching. One of them is actually something you might see as a limitation, which is that in a simulation, you are representing some parts of reality, but not all of it. A simulation by definition is not capturing everything about a real-life teaching situation. But in some ways, that actually works to our advantage, because teaching is an extremely complex activity. Anyone who's ever stood in front of a class of students knows that there's a lot going on at any particular moment. So what we're doing in our simulations is breaking it down into simple parts in order to focus people's attention on particular things. That allows you to really focus on discrete aspects and moves in teaching, rather than having to try to comprehend a large situation where lots of things are happening at the same time. Another thing that is particularly good about simulations is that they allow you to practice those moves within a simplified, lower-stakes setting.
And this is particularly good for novices who might not be ready to take on a more complicated situation and might benefit from doing things in a lower-stakes, simplified environment. But even for more experienced teachers, it's often helpful to take a step back and be reflective and think, okay, now that I'm out of my classroom and away from my students, what would I actually do in this situation, and then think about it and talk about it. And the final thing that is helpful with simulations is that they allow you to provide opportunities for targeted reflection and feedback. I use the term "targeted" here intentionally, because one of the things we know about learning new skills is that it doesn't necessarily help to do the same thing over and over and over again. What allows you to improve is to have deliberate practice, focusing on a specific skill set and getting feedback on that skill set. So what we're trying to do in our simulations is be very intentional, focusing on specific things and using those as opportunities for reflection and feedback in order to facilitate learning. So I've set up some of the background influencing our work, and I've talked about why simulations. Now I wanted to talk a little bit about the actual types of simulations we built. The Teaching Systems Lab has developed a platform called Teacher Moments. I know many of you are familiar with Teacher Moments, either through the Teaching Systems Lab or EdTech Design Studio, or just from being around the CMS department. But what's really cool about Teacher Moments is that it's a platform for authoring simulations. Anyone can go and build their own simulations in Teacher Moments. It's free and openly licensed, so people can use it however they want. We've had more than 300 scenarios authored within Teacher Moments and have had more than 6,000 users, and more and more people are using it every day.
So I definitely encourage you, if you're interested, to check out Teacher Moments and see some of its features. I wanted to give you a flavor of what a Teacher Moments scenario is actually like. Many times when we talk about these simulations, people assume, oh, you're doing something in VR, or you're doing something that's different from what it actually is. So I think it's helpful to actually see what a simulation looks like. I'm going to play about a three-minute clip from one of them, and I'm going to share my computer sound so you can hear it. This is a clip from one of our simulations called Roster Justice. The background to this is that in this simulation, you play a teacher who, a few weeks before school, gets your class roster. And you notice that in the computer science class that you're supposed to teach, the rosters are imbalanced. Even though your school is 50% African American and Latinx, the actual composition of the class is mostly white and Asian and male students. And so you go to the principal, Mr. Hall, who is played by Justin in this scenario, and you're going to have a conversation with him about the class rosters.
I wanted to talk with you about some of the scheduling changes. And I also heard that you wanted to talk to me. We've got someone else coming in a few minutes. So why don't we cut to the chase? Why don't you lay it all out for me?
Thank you, Mr. Hall, for meeting with me. I know that you have a busy schedule. Before we talk about the scheduling changes, I just wanted to share my concerns. I looked over the rosters for computer science, and I noticed that the computer science class doesn't reflect our student population. It has way more white students and male students than we have in the rest of the school. And I'm concerned that it means that our students of color and our female students are missing out on some opportunities to take computer science.
So can we talk a little bit about how we might be able to change the schedule so more students can take computer science?
Well, first, I'd like to say thank you for bringing the issue to my attention. I get your concerns. Really, I do. Unfortunately, there's not a lot that we can do. School is starting just three weeks from now. What are some quick fixes that you or I or someone else could do right now this year?
I don't think that there are any quick fixes. I think we need to take a broader look at how we do scheduling. Because of the way we schedule things, we ended up systematically excluding a group of students from computer science. And these are students who historically haven't had opportunities to take computer science classes. So I think this is a serious problem that we need to address.
I want to be super clear: we cannot change the schedules at this point. Sometimes imbalances will happen when you only offer one section of a course, like we're doing this year with intro to CS. We're offering intro to CS during period five and it's super complicated. We're offering intro to CS in period five, and we're offering algebra in period five and period one. So anyone who has intro to CS for period five is just going to have to take algebra in period one. What if I got you a teaching assistant to help with the class?
I don't think it's super complicated. There are a few weeks left before school starts, and we change kids' schedules all the time. I don't think a teaching assistant solves the underlying problem. Why can't we look at changing when the math courses are scheduled so that all of our kids can have a chance to take computer science?
Thanks for coming in. Thanks for coming in.
So, hope you guys enjoyed that scenario. I appreciate having Justin play the angry principal. One of the things I find, even though I've done this scenario many, many times, is that every time I do it, I still feel that jolt of, oh, I'm talking to the principal.
Like, what's he gonna think of me? How can I make this argument? I think that shows the affordances of these types of simulations: even though I'm just talking to a video, and I know that the video isn't going to actually respond in the moment, that it's not a real person, I still feel that emotion as I'm going through the scenario. A lot of our simulations are like this. Even though they're not necessarily capturing everything about what it would be like in the moment, there's something about them that feels very authentic. And we've found, having done this now with thousands of people, that people often say, yes, it does feel authentic. It feels like something that would actually happen in real life. That's really a testament to all of the people who worked to design these scenarios to be really impactful. So last spring, we launched a course called "Becoming a More Equitable Educator," and we took these simulations and put them within the course. The way that the course was structured is that within each unit, you would be introduced to a topic, and then you would do a simulation about that topic. After you did the simulation, you would watch a video of other teachers. We actually went to different schools around the country, did the simulation with teachers, and filmed a debrief with them about doing the simulations, which included interviews with individual people talking about the decisions that they made. This was last March, right when COVID was starting, so many people were turning to online learning at that time. Even though it was an online course, you were able to see what other people were thinking and doing in that moment. The course itself was structured around this idea of educator mindsets, and we framed it around four pairs of mindsets. The idea is that these are mindsets that are out of balance in US schools.
So just to give an example, one of the mindsets that we looked at is equity versus equality. Equity is that everyone gets the thing that they need. If some people need more things, they should get those things, and we should focus on individual need, not on giving everyone the same thing. Equality means that everyone gets the exact same thing, and we shouldn't give people special treatment. Now, what we argue is not that equity is good and equality is bad. The problem is that these mindsets are currently out of balance: we have too much equality, and not enough equity, in schools. What we want to do is shift people's choices, so that as they make these decisions they are thinking about how they can act with more of an equity mindset. And so each of the units in the course was framed around one of these pairs of mindsets, and there was a simulation attached to each of them. So this is all a setup for the research that we ended up doing. In the course, there were 963 people who did at least one simulation. And there were four simulations in the course, each of them embedded within one of the units. So it says it's difficult to read the presentation. Can everyone see my slides? Do I need to make it bigger?
I think if you can just, if you don't mind, share the link.
Sure. Yeah, I'm happy to do that. I will put it in. Thank you for that feedback. I'm putting the link in the chat so people can take a look at it. Okay, thank you. I will go back to the presentation. So as I was saying, we had four of these simulations embedded within the course, each within one of the units. So we had a lot, a lot, a lot of text data, probably more text data than any person could look at qualitatively. So I was interested in how we can use natural language processing tools to automate some of this analysis, to understand what people are doing in these simulations.
And so I looked into different ways of using natural language processing. There has been a lot of research that has used natural language processing within large-scale datasets. Just to give a few examples: people have used it to do automatic scoring of cognitive tasks, or machine scoring of assessments. People have used it to predict people's affective states. There's one study that looked at what students' feelings are about math; they looked at people's responses within an online math curriculum and correlated that with students' self-efficacy for math. And then a study that Justin actually worked on looked at responses in discussion forums, in a course that was a political course, to see how conservatives and liberals engage with each other within discussion forums. So there are a lot of different ways that you can use natural language processing to make sense of large datasets. I was particularly interested in whether we can use these tools with our simulation data. Can they provide some information about people's experiences within these simulations? The particular method I used was something called structural topic modeling. A topic model is a model that detects underlying patterns within large datasets by identifying latent topics within a text dataset. What's nice about topic modeling is that it does not require any a priori assumptions about the structure of the data. You don't have to have labeled data in order to use topic modeling; it draws the topics from the data itself. That was particularly good for this text dataset because we didn't have any labeled data, and we were also interested in exploring what people are doing within the simulations. We used what's called a mixture model. This is how the topic model works: it basically estimates the probability of a topic appearing within a text and of a word appearing within a topic.
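To make the mixture idea concrete, here is a minimal sketch in Python. It is an illustration only: the topic names and all probabilities are hypothetical toy numbers, not estimates from the course data, and it shows only how the two sets of probabilities combine, not how a structural topic model estimates them.

```python
# Toy illustration of a topic "mixture": a document is a blend of topics,
# and each topic is a probability distribution over words.
# All names and numbers below are hypothetical.

# P(word | topic): each topic's word probabilities sum to 1.
TOPIC_WORD = {
    "feeling_better": {"feel": 0.5, "better": 0.4, "note": 0.1},
    "doctors_note": {"feel": 0.1, "better": 0.1, "note": 0.8},
}

# P(topic | document): one document's topic proportions sum to 1.
DOC_TOPIC = {"feeling_better": 0.75, "doctors_note": 0.25}


def word_probability(word, doc_topic, topic_word):
    """P(word | document) = sum over topics of P(topic | doc) * P(word | topic)."""
    return sum(
        p_topic * topic_word[topic].get(word, 0.0)
        for topic, p_topic in doc_topic.items()
    )


if __name__ == "__main__":
    for w in ("feel", "better", "note"):
        print(w, round(word_probability(w, DOC_TOPIC, TOPIC_WORD), 3))
```

Fitting a topic model runs this logic in reverse: given only the observed words, it searches for the two tables of probabilities that best explain them.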
And I'll give you a little illustration of how that process actually works. The particular form of topic modeling that I use allows you to include covariates, which let you see associations between a topic appearing and some characteristic of a person. I'll dive into that a little bit later and show you how that works within our analysis. So this is the pipeline for how structural topic modeling works. The input is your text data. In our case, it was a big Excel file where each row is a different response within the simulation. You saw about three of the prompts from Roster Justice. All the simulations were basically prompts, and then people wrote in or said their answers, and we automatically transcribed those answers. So you have this big Excel file of data. You take that data, and the first thing you do is process it. This is important because within those pieces of text, there are a lot of words that are not that informative. Conjunctions like "and" or "or," prepositions, pronouns: those don't really provide you a lot of information about what's happening in the text. So the generally accepted rule is to pull those out of the text that you're looking at. Another thing we did was stemming, which is to say, getting rid of all of the conjugations. So for example, "explore," "explores," "exploring," "exploration": it took all those and made them one single word. That way, it focuses on the content and not on the particular way a word is used within a sentence. Another thing that we did is we separated the text out by line. I was interested in topics appearing within individual responses, and I thought that there might be multiple topics.
And so a way to pinpoint when a topic is occurring is by separating the text into individual lines. This ended up giving us even more rows of data to look at. And finally, on the recommendation of the authors of the structural topic model, I removed infrequent words. Words that didn't appear that frequently within each document, I took them out. We still ended up with over 1,000 different words within each simulation. What you end up with is this document-term matrix. Think of it like a big Excel file. Each column is a word, and each row is a different piece of text, or what's called in the natural language processing world a document. Basically, if a word appears in the document, it's coded as a one, and if it doesn't, it's coded as a zero. And this is for every single document in the text. What the structural topic model does is take that really big matrix and look for correlations between words within documents and between documents that have the same words. And it spits out a set of topics that have certain words associated with them, and documents that have certain topics associated with them. The challenge is that you have to specify in advance how many topics you want. The topic model will spit out 60 topics, or it will spit out five topics; you have to figure out, using some metrics, how many topics you actually want the model to produce. There are a number of different ways that you can do that to figure out the right number of topics to extract. The larger issue is that the topics don't come pre-labeled. It's not like it spits out an answer, "This topic is about, you know, baseball." You have to assign those labels yourself. Now, sometimes it's more obvious and sometimes it's less obvious.
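The preprocessing steps and the one/zero document-term matrix described above can be sketched in Python as follows. This is a hedged illustration, not the pipeline used in the study: the stopword list is a tiny made-up sample, `crude_stem` is a deliberately simplistic stand-in for a real stemmer, and the `min_docs` threshold for dropping infrequent words is a hypothetical parameter.

```python
import re

# Tiny illustrative stopword list (real lists are much longer).
STOPWORDS = {"a", "and", "at", "for", "in", "is", "of", "or", "that", "the", "to", "we", "you"}

# Suffixes tried in this order; a crude stand-in for a real stemming algorithm.
SUFFIXES = ("ation", "ing", "es", "e", "s")


def crude_stem(word):
    """Strip one common suffix so that related word forms collapse together."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, tokenize, drop stopwords and one-letter tokens, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOPWORDS and len(t) > 1]


def document_term_matrix(documents, min_docs=1):
    """Return (vocabulary, matrix) where matrix[i][j] is 1 if term j is in document i.

    Terms appearing in fewer than `min_docs` documents are removed
    (hypothetical threshold for dropping infrequent words).
    """
    processed = [set(preprocess(d)) for d in documents]
    doc_counts = {}
    for terms in processed:
        for term in terms:
            doc_counts[term] = doc_counts.get(term, 0) + 1
    vocab = sorted(t for t, c in doc_counts.items() if c >= min_docs)
    matrix = [[1 if t in terms else 0 for t in vocab] for terms in processed]
    return vocab, matrix


if __name__ == "__main__":
    docs = ["Explore the note", "exploring notes and exploration", "yesterday"]
    vocab, matrix = document_term_matrix(docs, min_docs=2)
    print(vocab)
    for row in matrix:
        print(row)
```

Each row of the resulting matrix is one document and each column is one stemmed word, which is the shape of input a topic model works from.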
For those of you who are familiar with factor analysis or cluster analysis, it's the same issue, where the model will group your data, but then you have to figure out as the researcher what this data actually means. So I'm going to walk you through an example of how we came up with labels, and this will show you how the topic model works. These were two topics that appeared in the model that I ran for Jeremy's Journal. One of them is topic seven. In this graph, the purple dot is the probability of a word appearing within topic seven. And, that was actually reversed, but the green dot is the probability of it appearing in topic 10. So this graph illustrates both the probability of a word appearing in topic seven and the probability of it appearing in the other topic. You can see that certain words are more likely to appear within one topic and less likely to appear within another topic. For topic seven, the words "today," "yesterday," "better," "feel" are very likely to occur within that topic, while for topic 10, "doctor," "absence," "note," "mom," those are more likely to appear. So this is giving us some sense of what is actually going on in that topic. Now I'm going to give you a phrase, and I want you to say in the chat which topic you think this piece of text comes from. Is it from topic seven or topic 10? Just to give you a reminder, these are the words for topic seven, and these are the words for topic 10. So I want you to look at this text: do you think it's topic seven or topic 10? I see a lot of sevens. And you're right, the model agreed with you. It said there was a 75% probability of this piece of text containing topic seven. Now, let's try this one: "Remember the school's policy is that you need a signed doctor's note in order to be excused from class." So is this topic seven or topic 10?
Just write in the chat. Seeing a lot of 10s. And again, you guys are right. The model agrees with you exactly. The model said that there's a 65% probability of topic 10 appearing. I should say that not all of the topics were this clearly delineated between the two; sometimes it's a little more ambiguous what exactly the model is detecting. But what we did is something similar to this process, where we looked at which documents the model said that topic was likely to appear in, and the words that were highly associated with that topic. Through that process, me and some other graduate research assistants, graduate students, came up with labels for the topics. For topic seven, we labeled that "Jeremy is feeling better." And for topic 10, we labeled it "doctor's note and school policy." We did this for all of the topics that were extracted in all of the simulations. So once we extracted the topics, what we were interested in was, are these topics associated with anything? Are they associated with participants' attitudes toward equity? Remember, I said earlier that the structural topic model allows you to include predictors. So in this case, we wanted to know whether there was a relationship with what participants said on surveys in terms of their attitudes toward equity. We had surveys for each of the mindsets in the course: there was a survey for equity and equality, and there was a survey for deficit and asset. And we wanted to see whether there is an association between how I respond to a survey and what topics I notice within the simulations. What was really exciting for us was that there was a pretty strong connection: certain topics you were more likely to notice if you had reported on a survey more of an equality mindset, and there were certain topics that you were more likely to notice if you had an equity mindset.
So what's interesting is that people who mentioned the doctor's note were much more likely to have an equality mindset, while people who asked Jeremy how he was feeling were much more likely to have reported an equity mindset on the survey. And we found this in all of the simulations that we did: there was an association between what topics you noticed and your mindsets on the surveys. So this was exciting for us: are these simulations actually measuring something real about what people think and believe about equity? That's good as a measurement piece, but Justin and other people working on the course were also interested in whether these courses actually worked. Did people actually change in their mindsets over time? So I wanted to look at whether there was a way we could use these simulations to understand whether people are actually changing their mindsets over time. The problem is that we had four different simulations. You can't compare the topics from one simulation to the next, because they were totally different simulations with totally different subjects, so the topics are not necessarily comparable. But what we could do is compare you to a set group of people. So for each simulation, I calculated the maximum probability that each topic would appear in your responses. You end up with a table like this: each row is an individual user, and for each topic there's a probability that the topic would appear in any of that user's responses for that simulation. This is not real data, but this is what it looks like. Some people were more likely to mention one topic, and some people were more likely to mention other topics. And then I thought, okay, we have all that data. Let's compare them to the people who started the course with the highest equity beliefs on the survey.
So I took the top quartile, the top 25% of people on the pre-survey in terms of their educator mindset beliefs, meaning whether they were shifted more toward equity versus equality or asset versus deficit. I wanted to see whether other people in the course become more similar, in the topics they mention in the simulations, to that group of high-equity users. I'm going to show you an example from our first simulation, but first I want to go back and explain something. How do you see how similar people's topics are to each other? Well, we use a concept called Euclidean distance. Those of you who remember trigonometry know a squared plus b squared, the Pythagorean theorem. This is basically the distance between one point and any other point. I took each person's topics and looked at how similar my responses are to this other person's responses. The closer my responses were to their responses, the more similar I would say my responses are. You can see in this example that if these are the top reference group, this person's responses are closer to this person's responses. This is not what the data really looks like: it's not two dimensions, it's many, many dimensions. But this is the basic idea when we're talking about distance. So in Jeremy's Journal, the people in the first quartile, the ones with the lowest equity beliefs, were the furthest away from that reference group. Their responses were the least like those of the top 25% reference group. These percentages here represent, on average, for any topic, how far away my response was in percentage terms. It's not a huge number. It's not like 60% versus 30%. But on average, there's about a 4% difference on any topic between someone in the highest group and someone who started in the lowest category.
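The two steps described above (collapsing each user's responses into one row of per-topic maximum probabilities, then measuring Euclidean distance to the top-quartile reference group) can be sketched roughly as follows. This is an illustrative sketch, not the lab's actual code: the function names, the made-up numbers, and the choice to average a user's distance over every member of the reference group are all assumptions, since the talk does not say exactly how the distances were aggregated.

```python
import math

def max_topic_probs(responses):
    """Collapse one user's responses into a single row.

    `responses` is a list of equal-length lists; each inner list holds the
    topic probabilities for one response. For each topic, keep the highest
    probability seen in any of the user's responses.
    """
    return [max(column) for column in zip(*responses)]

def euclidean(a, b):
    """Straight-line distance between two equal-length topic vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_distance_to_reference(user_row, reference_rows):
    """Average distance from one user to each reference-group member.

    Averaging over members is an assumption made for this sketch.
    """
    total = sum(euclidean(user_row, ref) for ref in reference_rows)
    return total / len(reference_rows)

# Made-up numbers with three topics, purely for illustration.
reference_group = [[0.70, 0.10, 0.20], [0.60, 0.20, 0.20]]
user_a = max_topic_probs([[0.50, 0.10, 0.10], [0.65, 0.05, 0.20]])
user_b = max_topic_probs([[0.10, 0.80, 0.10], [0.05, 0.60, 0.30]])

dist_a = mean_distance_to_reference(user_a, reference_group)
dist_b = mean_distance_to_reference(user_b, reference_group)
print(f"user_a: {dist_a:.3f}  user_b: {dist_b:.3f}")
```

In this toy data, `user_a` lands closer to the reference group than `user_b`, which is the kind of comparison the quartile plots summarize. With real data each row would have one entry per topic, so the same arithmetic runs in many more than three dimensions.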
And so, interesting as this is, it's just the first simulation. How do people change over the four simulations in the course? What was really exciting for us is that in the first simulation, you see that the people in the first, second, and third quartiles are all further away from the people in the fourth quartile. And as they progress through the course, they all get closer, both to each other and to the fourth quartile. So I think what this is showing is that people are becoming more similar over successive simulations. In this data, I'm only looking at people who did all four simulations in the course, so it's not that people are dropping off and the least equitable people are leaving. We're actually seeing people change to become more similar to people in that fourth quartile over the course of the course. So just to summarize: first, natural language processing tools such as structural topic modeling can help with these large simulation data sets, to understand what's happening in these simulations and to identify topics that are emerging. What was really exciting for us was that we would bring these results to our designers and they would say, "Yeah, I actually included that thing in that simulation on purpose. I wanted to see how people respond." So it's interesting that the machine learning model was actually able to pick up on some of those nuances. The second thing is that what people notice and mention in the simulation was associated with their beliefs and attitudes. That suggests that the simulation is capturing differences between people in how they respond to different types of situations. And finally, an interesting way of evaluating learning is that, by comparing simulations over time, you can see how people changed in terms of their beliefs and whether they became more similar, converging with one another as the course progressed. I know we're running short on time.
I just wanted to briefly talk about two future directions. One is that we're going to put in a grant to do a study where we're actually going to be doing this with teachers in grades three through eight in five different districts across the country. Our MOOCs have a lot of educators, but they're usually just people who randomly come into the course or hear about it. This way, we're actually going to go out and recruit people, and we're going to be able to link it to data about their students. So we're interested to know whether your response in a simulation is actually predictive of student outcomes, because right now we don't really know whether these behaviors are predictive of anything outside the simulation. We're hoping with this study to understand a little bit more about how behavior in the simulation is connected to behavior outside the simulation and, ultimately, how this affects students' experiences. The other thing that I'm really excited about is this idea of being able to give automated formative assessment and feedback within the simulation. So I'm going to play a short clip of what Roster Justice could look like in the future. This is from the last prompt that you heard: "Anyone who has intro to CS for period five is just going to have to take algebra in period one. What if I got you a teaching assistant to help with the class?" So now I'm writing, "That does sound super complicated. Yes, a teaching assistant would be great." I submit it, and then something pops up and says, "Remember that your main concern is the imbalance of classes. Does a teaching assistant solve that problem?" This is actually feedback that some of the facilitators were giving in person. So we're hoping to use machine learning models to detect when people are saying things like "Yeah, a teaching assistant would be great," so that we can give them that automated feedback.
This is something Marvis has actually been doing a lot of work on over the past few weeks, so I'm hoping that this is something that we could implement in the near future. So thanks again for listening to my talk. I'm glad I was able to come and talk to you, and I'm really excited to hear your questions and to hear what you think about all this.
I'm going to open it up to questions. I have one myself. I wasn't quite clear, when you showed the simulation the first time, on how the simulation has been built to respond to verbal responses. Are the responses from the principal pre-recorded? What is the spectrum of responses from the principal that could potentially be chosen, algorithmically, to respond to various inputs from the person who's going through the simulation?
Yeah, that makes sense. That's a really great question. The short answer is that it doesn't right now. There's no algorithm: no matter what you say, Justin will give the same response in that simulation. I think there are actually two things. One is that when we built this, we hadn't started to develop the technology to be able to respond in real time and have more responsive video. But the other thing is that often, if you're having an argument with someone, you're not really talking to each other. You're having a conversation where you say something and the other person just ignores what you're saying. So we've found with the argument types of simulations that it actually works pretty well to have a conversation with someone who totally ignores what you're saying. The person who designed this was also really knowledgeable about schools and how these conversations work.
And so when she was designing, I think she really thought about like the types of things that a principal would say in that moment. Yeah, I was actually going to say that that the lack of varied responses from the principal is probably closer to reality than if you tried to create multiple responses. Yeah, I have other questions, but I want to make open the floor to everyone. Well, I have my next question, which is just related to the the future directions that you just spoke about. And you were talking about the potential testing in the field where I just remember the number two hundred and fifty two. I don't know. I don't remember if that's two hundred and fifty two teachers or or schools teachers, teachers. Right. OK. And so I guess the question that I have there is if they are self selecting to take part in the simulations, is it, you know, how are you correcting for the possibility that those teachers who were open to to going through these similar simulations that are geared towards, you know, shaping or geared towards guiding people towards a more equity based mind mindset that would that be a self selecting set of teachers who would already be sort of a closer to that mindset or be more willing and open to have their mindset changed through a process like this. That makes sense. Yeah. So I actually think what you're pointing out is a good observation about the course, the online course that we did because, you know, we the courses is sort of open and free and take it. But the types of people who take it are the types of people who already care about equity equity issues. So one of the challenges kind of right up front is that there wasn't a lot. There wasn't as much variation in the types of responses as we would have liked. 
So even the people I called low equity, the first quartile, would probably be some of the more equitable educators within their schools. The people who probably most need the course are probably the least likely to be taking it. One of the things that we want to do with the study that I was proposing is more active recruitment of teachers. We're going to be working with districts to identify schools to participate in the study, and we'll actually be recruiting teachers and paying them to participate in this course. So what we're hoping is that we'll get a broader and more diverse group of teachers than the teachers who took the MOOC, and hopefully we'll be able to reach teachers who wouldn't necessarily pursue an online course on their own.
Roya, and then Tomas.
I was curious, and sorry, I'm outside, so if you can't hear me, that's why. I know that something we've thought about a lot in the work I do in the lab is how we translate a shift in mindset into a shift in practice. I would love to hear your thoughts on that in terms of this work and where you see that going with the implementation with teachers.
Yeah, I mean, I think it's a measurement challenge, especially in a MOOC, because we don't know in advance who is going to take our course. They just show up, and it's not like we can say, "OK, we're going to go to your class on this day." We actually were planning at one point to send Chris Buttermur, the postdoc who works on this project. We had a whole plan last March that he was going to find schools, contact them, and travel all around the country to do observations. But then obviously COVID happened, so that was no longer a possibility.
But I think that it will be like I'm really excited about the work that you're talking about in Spire Man. I think it'll be really interesting to actually see kind of what what what our teacher like if someone does something in simulation, is that related at all to to to to their actual practice? And is there is there a correlation there? I suspect that that there is based on like what we've seen in terms of like survey responses. But it'd be interesting to kind of actually connect that to actual practice. Thomas. Yeah, my question was really similar to to Royas. But I think maybe I can frame it in a different way. So yeah, first of all, thank you for the great presentation. It was extremely interesting and I'm really excited to to what can be done with methodology. So this would assume from the assumption that there is a goal to increase an equity mindset, right? And I was wondering, I mean, this is something that I can probably get behind out and behind up. But I wonder if if there are other possibilities to do this, like with other values, other normative values that that maybe are sourced from teachers and maybe like not from academic labs, but to see what what the other teachers have to say about that. And I was also wondering if. Yeah, so I think this is pretty much what you just answered to Roya, but I guess the what will be the conclusion from from from this method is that you can tell people that certain certain. Yeah, actions are associated with a certain mindset, but I wonder what's like the extra step, right? Like, and I guess that's what you address in the MOOC. So I was wondering if you could like discuss what are the contents of that work? Yeah. Let me know if I understand the question right. You asked him like, you know, what is like, OK, we can measure these things, but how is it actually going to change people's practice? Is that kind of subject? So it's two different questions. 
The first one is whether you could bring in other normative values instead of equity. If, say, they want to make more solidarity-minded teachers, what does that mean? And the second question would be whether you could talk about the MOOC experience: how can you turn this insight into specific practices?
Yeah. So I think the first thing is that equity is one area that we've worked on, but we've done this in a number of other areas. Roya has done a lot of work on math instruction, which is less a question of values where you can disagree about whether something is wrong. It's more of a teaching skill set than a normative value, although there probably are values baked into the particular skills that we're trying to teach. Marvis and I also worked on a course called Civic Online Reasoning, which was about preparing teachers to teach students how to identify misinformation online. It's based on a method developed by a group at Stanford called lateral reading: if you see something online, instead of spending a long time reading through it and trying to determine whether it's real, which is the traditional way that teachers have been taught to teach information literacy, they say that the best thing to do is actually to Google it and see who is behind the information. Another important thing in that course is that many teachers, and many of you have probably experienced this, think that Wikipedia is bad, that Wikipedia is totally unreliable as a source. But actually, there are some issues with Wikipedia, but it can be really reliable, especially for finding basic information about something. And Wikipedia also cites a lot of sources, so you can go to those sources and look up the information.
So we actually developed practice-based simulations to have people do these exercises and see how much they've learned about that particular skill set. And Marvis, I hope this is accurate to your work, has built this really cool classifier that basically takes a response to one of these tasks and detects, OK, did you figure out that this tweet was from a parody account or not? We're hoping to use those to give people feedback on learning. So, in answer to your second question: we have some data from our course, and it's all self-reported, so take it for whatever it's worth, that people were more likely after the course, and particularly four months afterwards, to discuss equity issues in their schools and to participate in networks around equity. So there's some evidence that after taking the course, people are more likely to engage in actions around equity. I would really be interested in learning more about what that actually looks like and whether there is a relationship between taking the course and doing these things. That's actually one of the reasons why we're proposing this study: to do a randomized experiment where we randomly assign teachers to take the course and see whether this actually changes their behavior. So it's definitely something that I'm really interested in as a researcher, and I'm also interested in whether something like an online course can actually affect behavior in the real world.
Thank you.
Anbar, and then there's the question in the Q&A from Will. Anbar?
Yes, thank you, Vivek and Joshua. I was wondering, when you asked us which option we would choose, we all chose one, right?
And like, that relates to equity, but I'm just wondering, like, even though, like, if teachers answer one and they are more likely to think about equity, also, I guess, like the school rules also affect, right? So I was wondering if you have thought about actually making simulation to change the way the principal thinks about it rather than the, like, the professors because they could, you know, like, they could think about equitity, but, you know, like, the rules are the rules, so I'm not sure. Yeah, yeah. And I think that's a really big point and something that, you know, comes up a lot in the course is that, you know, we intentionally frame the course around these sort of individual moments in teaching. And the reason we did that is probably because those are the easiest to change. So, like, I as an individual teacher, I can, like, make the choice to, like, interact with the student differently or push for something or, like, bring something out with my colleagues. It's more difficult to sort of more systemic issues. I think our theory of change is that, like, if people start to notice these things in their practice and they start to talk about with other people, that kind of builds up the momentum to sort of change some of these bigger things. So, like, you know, an individual person saying, like, hey, actually, like, let's think about how our school's policy about the document, how does that affect, you know, students who may not have health insurance or may not be able to easily get to a doctor, may not have the transportation? Like, how does this policy affecting those particular students? So I feel like, like, we start with sort of the individual and that can kind of lead to some of the more systemic changes. Great, there's Will's question and then Emily added on to that question. So I'm gonna read these so that others can hear them but feel free to follow along in the chat part. 
So Will writes: "I'm interested in the notion of discretionary spaces in classrooms and the types of situations in which they arise. I'm curious about how you go about deciding which specific types of discretionary spaces/scenarios to focus on in the simulations that you develop for the courses." And then Emily adds: "It's also interesting to think about whether or not teachers understand them as discretionary or not in particular contexts, and how that would interact with changes in their interest in enforcing school rules." Which, yeah.
Yeah, so all these scenarios are developed based on real things that happen in schools all the time. My wife is an educator, and I'm always telling her, "Oh, that example you just gave, I think that's a perfect one for Teacher Moments." I think the characteristic of a good discretionary space is that the answer isn't obvious. The one I gave you was actually a little bit more obvious than a really good one would be, but it has some sort of tension between "do I want to do this thing or that?" Because if it were an obvious decision, everyone would just do that. There has to be something countervailing that would make you think, "Well, maybe I don't want to do that." One of the things at the end of the Jeremy's Journal simulation is that, because he's missed school, he asks if he can be excused from taking a quiz that you give every week. He comes up to you and says, "Can I get out of this quiz?" Whenever we've done this, we've gotten many, many different responses, and there isn't really a right answer. It's not that if you give him the quiz you're not equitable and if you don't give him the quiz you're equitable, or vice versa, because there are reasons for wanting to give him the quiz. I think what's important is your reasoning behind it.
So maybe we'll say like, oh, I'm giving him the quiz because I want to know how he's doing and then I'll give him feedback and we'll talk about it and I'll understand more about like what is going on with him versus someone saying, oh, I don't want him to give him the quiz because like everyone else has to take the quiz and that's just the rule and he's got to adjust. This is the real world. Like no one's going to care about his situation at all. So like the decision is not as important as the reasoning behind it. And that's actually why I think that the natural language processing is so important. It's because we couldn't just construct these scenarios as like multiple twists, like do this or do that. But it doesn't kind of tell you like why, like why do you want to do this? And that's where I think some of the text analysis stuff really comes into play. It allows us to understand some of the nuances of why people pick something and not the other thing. Marcos. Great presentation, Josh. Well, I just was wondering if you could go into a little bit more about like the development of the scenarios, like who are the designers and like how we plan for research, especially in the context of IES where we're going to be doing it over all these districts with all these different teachers in different contexts. Yeah. You're saying like how do you, how do you design it to work with different types of audiences? Yeah, I mean, I think like one of the challenges, especially in a MOOC is that you're doing it for a global audience. So you're doing it for people who are, who are like may not be as familiar with certain things in the US context. I think what works well as a scenario is like the universal enough experience that many like people could identify with it. 
So, even if the specifics are not the same. I know this is actually something where I originally thought, "Oh, people aren't going to get this Jeremy's Journal scenario because it's so specific to US schools." But when I actually looked at the data, a lot of people got it and really understood what was happening there, even if they didn't understand the specific details. And that's where I think the characters and their arc really matter, and I guess this connects to the media piece. What makes the scenario meaningful as a learning experience is whether you can connect to what's actually happening in the situation.
I have another question that is more geared towards taking this opportunity to talk about methodology, since half of our students are taking a media research methodologies class with me at the moment. I wanted to ask you about the research design, and iterative design-based research specifically, in relation to the natural language processing and how you've built that out. What are the steps that you go through in order to fine-tune how that works?
Yeah, so it was a very iterative process, I would say. It wasn't something where I knew exactly what I wanted to do from the beginning. We had been doing these scenarios for some time before we did the course, so I had some understanding of what the universe of possible responses to these types of scenarios was. I knew that there were certain patterns to how people would respond. You could see, "Oh, people respond this way or they respond that way." But all that data was on a small scale, so I didn't know exactly how it would work when we did it at a larger scale.
The other sort of like decision that I had to make, I hope this isn't getting too into the weeds, is like thinking about the goal with using the natural language processing. So like one approach could have been to say like, okay, I'm gonna like score each of these responses and like see if I can build a kind of classifier to predict like what are people, are people, you know, acting equitably or not equitably. But I kind of wanted to leave it open-ended to feel like I'm not gonna like predetermine what is equitable and what is not equitable. I want to see like based on the data, like what are people, what are people noticing? You know, what are the differences between people and what they notice? So that's kind of the like framework I used and why I was sort of leaning toward more unsupervised models that would allow kind of allow the data to sort of like generate. And then once I had it, then that was a kind of like as I was saying, labeling is not always, is it kind of a difficult process? Cause you can kind of, you have to make a subjective decision about what the data means. And that's where I actually did a lot of conversations with the people who designed the simulations, looking at the data, talking with other people, getting some kind of validation by thinking about like, okay, do other people see the same things that I'm seeing? Cause it is a, it's subjective, but you want it to sort of be like transferable. So other people could kind of come to a similar conclusion. And then that's one of the challenges with this type of analysis is that there is some, like research or subjectivity and you don't want to sort of hide that behind the like objectivity of like, these are the numbers. You want to say like, this is the process that I use to kind of come to this conclusion. But actually, we actually did, I actually had people like, who weren't involved at all in the design of the model. 
Like I gave them like a task where they had to choose which tasks, kind of like what we did in this exercise, like I like selected a piece of text and then like gave them a bunch of different topics. And one of them was the one that the model said it was most associated with. And so we actually could predict like, had them say like themselves, they're basically doing what the model is doing, like, you know, which topic is most associated with this piece of text. And it actually was, we matched the model about 65% of the time. So it wasn't like a perfect connection. And often it was in cases where the model was kind of iffy about like what was actually in the text. So I think it's like one of those cases where like, the more like cases where it's more clear cut, it was easier for people to kind of correspond with the model. In cases where it was less clear cut, there was more ambiguity there. Just a quick follow-up is, you know, you touched on just now the, how a particular response, you know, might be, I think you were using the example of whether or not to give the student the test and how they're, you know, one, two teachers who decide to give the student that test might have different reasoning. One of which is more aligned with the equitable mindset and one is not. Two, and maybe this is partly, you know, already answered, but I'm curious about how that kind of qualitative assessment is built into the overall project. You know, whether, you know, so that you can, like you're saying, you're not predetermining in the model which kind of response corresponds to equitable mindset and which is not. And so that in those kind of like fine areas or gray areas, it seems like more qualitative engagement with the subjects or with the teachers is sort of what will give you the data that you need. So I'm curious how, you know, the kind of interplay between that kind of qualitative method and the kind of modeling that you're doing. Yeah. 
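The validation exercise described earlier in this exchange, where raters who were not involved in building the model picked a topic for a piece of text and matched the model about 65% of the time, boils down to a simple agreement rate. The sketch below is a minimal illustration under assumptions: the function name and the topic numbers are hypothetical, and the talk does not say how many texts or raters were involved or whether any chance correction was applied.

```python
def agreement_rate(human_choices, model_choices):
    """Share of texts where the rater picked the model's top topic."""
    if len(human_choices) != len(model_choices):
        raise ValueError("need one human choice per model choice")
    matches = sum(h == m for h, m in zip(human_choices, model_choices))
    return matches / len(model_choices)

# Hypothetical topic numbers for ten pieces of text (not real data).
model = [7, 10, 7, 3, 10, 7, 3, 10, 7, 3]
human = [7, 10, 3, 3, 10, 7, 7, 10, 10, 3]

rate = agreement_rate(human, model)
print(f"Raters matched the model on {rate:.0%} of texts")
```

Raw agreement like this does not account for matches that would happen by chance, so a chance-corrected statistic would be a reasonable follow-up when there are only a few candidate topics per text.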
Yeah, I mean, I see Justin's comment about using an unsupervised approach to complement traditional qualitative, bottom-up, grounded theory types of analysis. And I think that's actually a good analogy for how we look at this type of data, because in the course it was probably over 100,000 rows of data. We could have coded all that data, but I'd probably still be working on it now. What was really interesting was that once you can see what's happening in the data, you can start to put on your qualitative lens and look more closely at what this actually means. My training is quantitative, but I've also done some mixed-methods and qualitative research. I'm interested in that space of how we can use large amounts of data but also understand more qualitatively what people are thinking, which is more difficult to capture with a multiple-choice survey. So this is why I'm very excited about this type of methodology going forward.
Thank you. Other questions? Well, we're almost at 6:30.
I have one last suggestion. This is Justin. Sorry, my house is chaos, so I'm not gonna...
That's okay. That's okay. The students got a bit of a taste of the chaos in my house at the end of the last session.
If we have just a couple of minutes, if folks have any either substantive or stylistic suggestions for Josh, thinking ahead to his next presentation: if there are parts of the presentation where he lost you, or that became uninteresting or problematic, or parts that really connected with you and engaged you, I'm sure any of that feedback would be helpful as he thinks about that. You know, Vivek, you've seen lots of these kinds of talks too.
So if there are thoughts that you had, I think that would be really, really welcome and valuable.
Josh, who's gonna be your audience?
Sorry?
Who's gonna be your audience for your talk?
So this is at Northeastern; it's their College of Art and Media Design and also applied psychology. So there are people who are more media design folks, and there are also people who are more working in schools as educational psychologists. And so, actually, I think this is a good representation of the diversity: people who may not know that much about education but know a lot about design, and people who know a lot about education but might not know that much about design. Also part of it, and that's why I actually spent more time than I normally would, is that they're interested in someone to teach statistical methods. And so I wanted to also showcase some of my own ability to explain statistical concepts in a way that works for a lay audience.
From the standpoint of someone who is definitely on the media and arts and nice-things-that-don't-have-numbers side, I gotta say at some point you lost me with the technical stuff. So I would consider framing it more around the story that you're telling and not just starting from the methods, because that might be awesome.
Yeah, that's helpful. Yeah.
Something that was difficult for me was when you were asking us to choose between seven and 10, just because I couldn't see, you know, I couldn't connect with the other one.
Yeah, yeah. Yeah, I thought about that. It might be good to have the visual there, so you can see the word distribution.
Yeah. I guess one thing, and this was behind, I think, one of the questions that I asked, was the first time when you were going through the simulation with the video responses from Justin.
It wasn't, you might wanna do a bit more contextualizing there in terms of what the user experience is.
Yeah, exactly, yeah.
And in part, I think it was, I was hearing your voice in audio, right? But it wasn't in real time.
Yeah, okay.
So then I got confused about, for a person who's actually going through that simulation, what are they inputting and how?
Yeah.
And then how are they experiencing the response?
Yeah, that's the point. Yeah, I actually thought about that. It doesn't seem like, people thinking of everybody in the talking in real time. Was it helpful to have that example? Like, do you think that is helpful? Should it be shorter?
I appreciated that example.
Yeah.
In part because of showing people what the actual experience is gonna look like for the teachers who are going through the simulation. And that's why fine-tuning that, so that it's clearer what that experience is, I think would make it that much more effective.
Yeah, that's really helpful. Yeah, I can definitely do that.
Just as a point of clarity, I wanted to make sure I understood properly that people can respond however they want when they're in the simulation, and those two examples you gave us at the beginning were just common examples that represented...
Yeah, yeah, I think that's true. So these were two responses that people gave, and I was trying to give options. I couldn't have it actually be open-ended. I just wanted to get the sense of the binary, but I could probably make it clear that these were examples that people gave and that it's not like in the simulation you have to choose between these two.
Yeah, I think that's been to the documents that you're feeding into your model would be helpful.
Yeah, exactly. But yeah, I think that I could make that clear.
Then now that I understand, just to follow up on that, one question I had was whether, after getting these results, you looked into some of the responses themselves to see whether the convergence you were showing was a meaningful convergence, or whether people might be using the same kinds of words but in different ways based on...
Yeah, that's a good question. That's actually not something that we've done. I've thought about it, but I haven't actually done a more, I think you would need to do a kind of qualitative analysis to actually look at an individual person. I think it is very possible that people are picking up terminology in the course and then using it in the simulations, which is not a bad thing. I think that's actually kind of what we want: people to learn things and then apply them. But, and I hope this didn't come off as saying that, I'm not saying that now they're becoming more equitable and now they're actually in the courses and they're doing all of the equity practices. I think this is a longer-term process. I think what we're showing is that people within the simulated contexts are being more cognizant of the type of issues that someone who already has a high equity mindset would see in that type of simulation.
Great. Thank you so much. And in order to not repeat last week's explosion of my daughter from the background, I will thank you and thank everyone who joined us tonight, and thank you all for your questions and your feedback. And break a leg. Well, don't literally break a leg, but good luck.
If you have any feedback, please feel free to get in touch with me. My email is JLTBIAS, so any feedback is very welcome.
Great. Thank you. Bye.