Hi everyone, I'm Jason Clark. I'm a professor at Montana State University, in the library, and I'm going to talk about a particular set of research where we're trying to think about how you turn algorithmic experiences into forms of data.

As far as what's ahead: I'll provide a little context for where this research started, point out the bits of that research that felt unfinished to me, and then show where the next cycle of this research is going. When we started this research, we were thinking through how you teach about algorithms, and how you teach to all audiences. We're a land-grant university, so that means everyone from Montana citizens to students to researchers. How do you build an audience? How do you teach these sometimes complex ideas at all levels of learning? In this particular session I'll talk a bit more about algorithmic user experience, then pull through a potential auditing technique, and close with some research implications.

The initial research was funded by the IMLS, the Institute of Museum and Library Services, which is our primary granting agency. The grant was really about looking at particular forms of algorithmic user experience as a teaching moment with algorithms. We did things like finding ways to document the first principles of an algorithm: how does a weighted graph network work in Facebook, hypothetically? And then we would work through that in workshops or in teaching settings.
We'd use pseudocode to get people thinking in the code mindset, and thinking about what it means when you make decisions with a program. So this work was really focused on the pedagogy of the algorithm, and in a workshop setting this is the type of thing that worked well: introducing a broad concept and giving people a chance to see how it might impact them materially. We did exercises with them on digital redlining and on decisions insurance companies might make. So there were a lot of ways into this teaching space.

But there was a component of this research that always felt a little unanswered to me. There are these experiences that we have, whether it's in TikTok, YouTube, or Instagram (primarily social media is where I was thinking initially), and we're not really able to understand exactly how those algorithms might manifest as a data point or as a data set. What I was struggling with was: how do you start to
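As a concrete example of the kind of pseudocode-style exercise described above, here is a toy sketch of how a weighted graph might rank posts in a feed. Everything here, the relationship weights, the posts, the scoring formula, is invented for teaching purposes; it is not Facebook's actual algorithm.

```python
# Toy teaching example (not a real platform's algorithm): rank posts
# by weighting engagement with a hypothetical "closeness" edge weight
# between the viewer and each author.

# Invented edge weights from our viewer to other accounts
edge_weights = {"close_friend": 0.9, "coworker": 0.5, "brand_page": 0.1}

posts = [
    {"id": 1, "author": "brand_page", "likes": 500},
    {"id": 2, "author": "close_friend", "likes": 3},
    {"id": 3, "author": "coworker", "likes": 40},
]

def score(post):
    # Weight engagement by relationship strength: a close friend's
    # post with few likes can outrank a popular brand-page post.
    return edge_weights[post["author"]] * (1 + post["likes"] / 100)

ranked = sorted(posts, key=score, reverse=True)
print([p["id"] for p in ranked])
```

The point of the exercise is the decision embedded in `score`: by multiplying engagement by relationship weight, the program's author has decided that who posted matters more than how popular the post is.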
quantify something like an algorithmic user experience, like the idea of the timeline in Facebook, or even the Twitter timeline? I wanted to start to answer: what if these things were a little more visible? What if we came up with a set of pattern-recognition techniques where we could actually say, okay, we captured a form of that user experience and encoded it as a kind of data set?

In that initial research, we were able to pull apart all kinds of experiences of algorithms. If you think about the top one here, masking: this one's probably a little harder to quantify, and the demo I'll show doesn't really cover it, but the filters we can put on ourselves inside of TikTok or Instagram are a form of algorithmic user experience, and the "lawyer cat" incident is a visible impact of that kind of experience. But again, how would you start to capture that experience as a data point?

And we don't have to go far for why this matters. Just last week, for those of us in the United States, there was a hearing on Capitol Hill in Washington, DC related to how algorithms work in social media. What was interesting about that particular hearing was that there were policymakers, policy analysts, and academics talking about the role and place of algorithms in these environments.

So as the end of that first cycle of research was coming about, a group of us who were still on the grant carried this forward. Those partners are Mariel Keringa, Julian Kaptenian, and Tyler Bass.
These are undergrads who worked throughout the project to answer some of the questions from the first part about the ways you might teach or introduce these concepts. One of the things we all settled on was the idea of the YouTube recommendation engine. This is a little dated, from 2018, but the principle still stands: a lot of the way that people are kept and moved through the YouTube ecosystem is through this recommendation engine. So as a demo and proof of concept for parts of this project, it seemed really useful for us to think about.

And here's where we started. This is a screenshot of a query for "algebra," using my own actual profile. If you look on the left, you can see it's quickly moving me from the query about algebra to various recommendations that we classified as out of scope. This was a way for us to evaluate and analyze the algorithm, and it's instructive because it shows how quickly that recommendation engine moves from the original query to different types of personalization. I am not interested in lumber prices, but there it was. So this case study was really about how you would begin to audit this experience.
I mentioned Mariel, Julian, and Tyler before. Before I leave this slide: the thing we started with was trying to use the API to see how recommendations were happening for particular videos and subject topics. But we realized there's a real limitation there. The Pew Research study of the YouTube recommendation algorithm used only the API, and what we found is that without the kind of personalization that comes with the actual browser experience, you don't fall into the rabbit holes you see in this example, where it's trying to pull you out of scope.

So the first thing we had to do was think through how you simulate a user. That meant creating a user agent and a profile, and moving to web scraping and harvesting, not API data mining. That's the first step, and one of the goals of this session is to introduce how you might start to do this; again, we're at the early cycles of this work.

Step one, then, is to move away from the APIs. Even though those APIs are available, whether it's TikTok, Instagram, or YouTube, you're not going to get the ability to understand the actual algorithmic user experience, or more importantly to record it, without first simulating a user and a profile. Step two is taking what you're actually seeing.
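A minimal sketch of that first step, presenting yourself as a browser-like user rather than an API client, might look like the following. The user-agent string, cookie fields, and URL here are illustrative assumptions, not the project's actual setup; a real audit would likely drive a full browser (Selenium or similar) so that personalization accumulates across sessions.

```python
# Sketch of "step one": simulate a user rather than calling the API.
import urllib.request

# Present ourselves as an ordinary desktop browser, not a script
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)

def build_request(url, cookies=None):
    """Build a request that looks like a profile's browser session."""
    headers = {"User-Agent": USER_AGENT, "Accept-Language": "en-US,en;q=0.9"}
    if cookies:
        # Cookies are where the simulated profile's personalization lives
        headers["Cookie"] = "; ".join(f"{k}={v}" for k, v in cookies.items())
    return urllib.request.Request(url, headers=headers)

req = build_request(
    "https://www.youtube.com/results?search_query=algebra",
    cookies={"PREF": "hl=en"},  # placeholder cookie, not a real profile
)
print(req.get_header("User-agent"))
```

The request is only constructed here, not sent; the point is that every fetch in the harvest carries a consistent simulated identity, which is what makes the personalized rabbit holes observable at all.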
That means scraping the page, identifying the initial video, and then moving to understand and classify the recommendations on the right side of your screen. There were certain signals, or tells, in the recommendations that we used to start to classify good recommendations versus out-of-scope or bad recommendations. Things like all caps in the title: if a video is screaming at you, there were chances that the quality wasn't as good and that it might be pulling you out of scope from your original, intentional query. We could also watch for hyperlinks in the description, because in terms of quality those tended to be spammier videos. These were ways we could grade and evaluate the video itself. And because we had noted the original query intent, we could recognize when the subject was moving out of scope.

Then finally, regardless of where the analysis ended up, the last thing we realized was necessary was a way to capture and record what was there. That's step three for us, and for anybody thinking routinely about this kind of work: package that data set for reuse. Those of you in this data-focused group will inherently notice that web scraping and harvesting can be very brittle. A front-end design can change pretty quickly, and suddenly that harvest of the video title is in a different set of tags and your script isn't working. That's another reason to pull everything together and make a recording of what the algorithmic user experience was like at that particular date and time.

[Moderator] Jumping in to give you a five-minute warning.
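The classification "tells" described above could be sketched as a simple scoring function. The thresholds, field names, and example video here are assumptions for illustration, not the project's exact rules.

```python
# Heuristic signals from the talk, sketched as a classifier:
# all-caps titles, hyperlinks in the description, and topic drift
# away from the original query.
import re

def classify_recommendation(title, description, query_terms):
    """Flag a recommended video as in-scope or likely out-of-scope."""
    flags = []
    # Signal 1: a title screaming in all caps tended to mark low quality
    letters = [c for c in title if c.isalpha()]
    if letters and sum(c.isupper() for c in letters) / len(letters) > 0.7:
        flags.append("all_caps_title")
    # Signal 2: hyperlinks in the description tended to mark spammy videos
    if re.search(r"https?://", description):
        flags.append("links_in_description")
    # Signal 3: no word overlap with the original query suggests drift
    words = set(re.findall(r"\w+", title.lower()))
    if not words & {t.lower() for t in query_terms}:
        flags.append("out_of_scope")
    return {"flags": flags, "in_scope": "out_of_scope" not in flags}

result = classify_recommendation(
    "LUMBER PRICES GOING CRAZY!!!",
    "Subscribe here: http://example.com",
    ["algebra", "math"],
)
print(result)
```

A real grader would need something smarter than word overlap for topic drift (the talk's binary good/bad labels came from human judgment), but even crude tells like these make the drift recordable.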
We're going to go a couple minutes over since we had a slow start. [Moderator] Okay, no worries. Thank you.

So, for packaging: we looked at the Dataset and DataFeed types as part of the schema, but the one that really caught us and really helped us was a way to package a source of research data together with its metadata: the RO-Crate metadata specification. If you're not aware of it, it's a really useful tool. You can put a manifest together and even embed procedures, so it carries behavioral metadata, like how you would run an analysis on a particular data set, alongside the data set itself.

So, stepping back. When you think about this work, and again we're at early cycles and I think we're ready to think about further fellowships or different ways of facilitating this kind of inquiry, it's about: finding that user agent and profile; mapping out your harvesting and data points; collecting and analyzing the data; and then encoding those results.

And there are lots of implications here. If there's one thing I want us to start thinking about or taking away, it's this: how do we identify and record algorithms as a form of data, as data sets? I think the ghost that was left from the first part of the research was how we move beyond naming, defining, and identifying an algorithm or algorithmic user experience to quantifying it.

A couple of notes: there's a code repository where all of the work of the research lives. There's the auditing tool, or at least parts of the script; a small app that does some transparency work, where you can run a search user experience and look at what's happening behind the scenes; and there's a curriculum, syllabus, and teaching resources as well. Sorry, go ahead. [Moderator] That's okay.
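A minimal RO-Crate manifest for one audit capture might look like the sketch below. The file names, dates, and descriptive fields are placeholders I've invented; the RO-Crate 1.1 specification defines the full entity model, including how to attach the kind of behavioral/procedural metadata mentioned above.

```python
# Sketch of an ro-crate-metadata.json for a single recommendation
# harvest; all names and dates are placeholder values.
import json

crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            # The metadata descriptor that declares conformance
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            # The root dataset: one capture of the algorithmic UX
            "@id": "./",
            "@type": "Dataset",
            "name": "YouTube recommendation audit capture",
            "datePublished": "2022-01-01",  # placeholder capture date
            "hasPart": [{"@id": "recommendations.csv"}],
        },
        {
            "@id": "recommendations.csv",
            "@type": "File",
            "name": "Scraped recommendations with scope classifications",
        },
    ],
}

manifest = json.dumps(crate, indent=2)
print(manifest[:60])
```

Because the harvest script is brittle and the front end will change, it's this manifest, recording what was captured, when, and how, that makes the snapshot of the experience reusable later.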
I just wanted to call out three influences: Laurie Allen and Thomas Padilla for Collections as Data, if you're not aware of their work; Safiya Noble, whose Algorithms of Oppression was inspirational for us; and The Markup, a new nonprofit newsroom that's doing a lot of this work on how you not only report on our technologies but create data and analyze them. Thanks.

[Moderator] Awesome, thank you so much. That was really fascinating. We've got maybe two minutes for questions, and we have two questions in the chat, so let's go ahead. First, Megan asks: does algorithm equal user agent? In other words, are you using algorithms to measure an algorithm? That is so meta, I love it.

I guess I wouldn't say the user agent is an algorithm, but certainly the work of creating that profile, or of identifying that profile, is something that feels algorithmic, I suppose. [Moderator] It's okay, Megan. Yeah, that's good. No apologies. We have one more question, from Nikki, who asks: did you do anything like survival analysis with the recommendations, i.e., survival time to the rabbit hole?

No, and that's a great question. That part of the analysis was really very binary: is this a good one, is this a bad one? We just wanted to get a start. But I feel like there's so much more to do as far as the metrics there: putting it into a pandas DataFrame, because I work in Python, and building out a couple of new fields to qualify the value of the analysis of the algorithm itself. I'm flashing back: the web scraper pulled things like the duration of the video, just to see how long things were, if that could be an indicator. There's plenty more work to be done, and I'm excited to keep asking this question.
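One way to sketch Nikki's "survival time to the rabbit hole" idea, without even pulling in pandas, is to count the recommendation hops before the first out-of-scope video in each session. The session labels below are invented for illustration; real input would come from the binary good/bad classifications the talk describes.

```python
# Sketch of a survival-style metric over classified recommendation
# chains: how many hops until the first out-of-scope video?
# Session data is invented for illustration.
sessions = {
    "algebra_run_1": ["in", "in", "out", "out"],
    "algebra_run_2": ["in", "in", "in", "in"],   # never left scope
    "algebra_run_3": ["out", "in", "out", "in"],
}

def hops_until_out_of_scope(labels):
    """Return the 1-based hop index of the first out-of-scope rec,
    or None if the session never left scope (a censored observation)."""
    for i, label in enumerate(labels, start=1):
        if label == "out":
            return i
    return None

survival = {name: hops_until_out_of_scope(seq) for name, seq in sessions.items()}
print(survival)
```

Sessions that never leave scope come back as `None`, which is exactly the censoring a proper survival analysis would need to account for.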